0x01 Preface
I'm finally retiring my HP DL380 G6. Since I need to rein in my electricity usage, I've decided to take the 380 G6 out of service and give it to a friend. Before it goes, I'm running one last round of performance tests, to leave myself a good memory of it.
0x02 Test Plan
The server currently holds four hard drives, all HP 300GB units, plus eighteen 4GB DDR3 ECC REG DIMMs.
The array is currently configured as RAID 5. I'll install CentOS 7 and test both disk and overall performance, then rebuild the array as RAID 10 and repeat the same tests.
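The later rebuild means deleting the RAID 5 logical drive and creating a RAID 1+0 one in its place. A minimal sketch of how that might look with HP's hpacucli tool, assuming the controller sits in slot 0 (the slot number is a placeholder, and the delete destroys everything on the array):

```bash
# Destroy the existing RAID 5 logical drive (irreversible!)
hpacucli ctrl slot=0 ld 1 delete forced

# Create a RAID 1+0 logical drive from all drives not yet in an array
hpacucli ctrl slot=0 create type=ld drives=allunassigned raid=1+0
```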
After that comes the power test. First, with Power Capping disabled, I'll record the power draw at zero load and at 100% load; then, based on the peak power measured, I'll set Power Capping to 50% and repeat the 100%-load measurement and the overall benchmark, so the power draw and overall scores can be compared. (The power readings themselves can be scripted; see the sketch right after the list below.)
Briefly, the tests are:
- disk performance under RAID 5;
- disk performance under RAID 10;
- power draw at zero load, Power Capping disabled;
- power draw at 100% load, Power Capping disabled;
- overall performance, Power Capping disabled;
- power draw at 100% load, Power Capping set to 50%;
- overall performance, Power Capping set to 50%.
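For the power-draw items, here is a minimal logging sketch, assuming ipmitool is available and the management processor answers DCMI power-reading requests over IPMI; the hostname and credentials are placeholders, and on a G6 the same numbers can also simply be read off iLO's power meter page:

```bash
#!/usr/bin/env bash
# Sample the BMC's instantaneous power reading once per second into power.log.
# Assumes the BMC supports "dcmi power reading"; host/user/password are placeholders.
while true; do
    watts=$(ipmitool -I lanplus -H ilo.example.local -U admin -P secret \
                dcmi power reading | awk '/Instantaneous power reading/ {print $4}')
    echo "$(date '+%F %T') ${watts} W" >> power.log
    sleep 1
done
```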
First, I need to confirm the hardware is in a healthy state:
Apart from one cooling fan showing as offline, one disk had dropped out because of a bad connection, leaving the array in a rebuilding state.
I'll start the tests once the rebuild finishes.
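The rebuild can be watched from the OS with HP's Smart Array CLI. A minimal sketch, assuming hpacucli is installed and a slot-0 controller (on newer HP tool releases the binary is ssacli):

```bash
# Controller, array, and logical-drive status; a rebuilding logical
# drive reports "Recovering" along with a completion percentage
hpacucli ctrl slot=0 show config

# Per-disk status, including the drive that dropped out
hpacucli ctrl slot=0 pd all show status
```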
Finally, the server's hardware:
- CPU: 2× Intel Xeon L5630
- Memory: 12× 4GB DDR3 ECC REG
- Disks: 4× 300GB HP "Velociraptor" drives
0x03 Tests
0x03.1 Disk performance under RAID 5
```
[root@380p1 temp]# dd bs=1M count=1024 if=/dev/zero of=1gb-1.test conv=fdatasync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 3.79824 s, 283 MB/s
[root@380p1 temp]# dd bs=1M count=1024 if=/dev/zero of=1gb-2.test conv=fdatasync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 3.64858 s, 294 MB/s
[root@380p1 temp]# dd bs=1M count=1024 if=/dev/zero of=1gb-3.test conv=fdatasync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 3.75337 s, 286 MB/s
[root@380p1 temp]# dd bs=1M count=1024 if=/dev/zero of=1gb-4.test conv=fdatasync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 3.77373 s, 285 MB/s
[root@380p1 temp]# dd bs=1M count=1024 if=/dev/zero of=1gb-5.test conv=fdatasync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 3.69217 s, 291 MB/s
```
The dd runs put the sequential write speed at roughly 285 MB/s.
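For what it's worth, the five runs above can be scripted instead of typed by hand; a minimal sketch using the same parameters and file names:

```bash
# Run the 1 GiB fdatasync-backed write test five times in a row
for i in 1 2 3 4 5; do
    dd bs=1M count=1024 if=/dev/zero of=1gb-$i.test conv=fdatasync
done
rm -f 1gb-*.test   # clean up the test files afterwards
```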
4K mixed read/write test (fio's -rw=rw workload):
```
[root@380p1 ~]# fio -filename=/root/bench/test.fio -direct=1 -rw=rw -bs=4k -size 1G -numjobs=8 -runtime=30 -group_reporting -name=file
file: (g=0): rw=rw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
...
fio-3.6-31-g4386
Starting 8 processes
file: Laying out IO file (1 file / 1024MiB)
Jobs: 8 (f=6): [M(4),f(2),M(2)][100.0%][r=110MiB/s,w=110MiB/s][r=28.1k,w=28.3k IOPS][eta 00m:00s]
file: (groupid=0, jobs=8): err= 0: pid=31024: Sat May 12 17:31:49 2018
  read: IOPS=28.8k, BW=112MiB/s (118MB/s)(3370MiB/30001msec)
    clat (usec): min=50, max=76565, avg=134.70, stdev=159.60
     lat (usec): min=50, max=76565, avg=135.01, stdev=159.60
    clat percentiles (usec):
     |  1.00th=[   74],  5.00th=[   86], 10.00th=[   93], 20.00th=[  102],
     | 30.00th=[  109], 40.00th=[  113], 50.00th=[  117], 60.00th=[  123],
     | 70.00th=[  131], 80.00th=[  143], 90.00th=[  163], 95.00th=[  190],
     | 99.00th=[  570], 99.50th=[  693], 99.90th=[ 1516], 99.95th=[ 1614],
     | 99.99th=[ 2180]
   bw (  KiB/s): min=11144, max=17628, per=12.51%, avg=14385.93, stdev=1033.69, samples=473
   iops        : min= 2786, max= 4407, avg=3596.46, stdev=258.42, samples=473
  write: IOPS=28.8k, BW=112MiB/s (118MB/s)(3370MiB/30001msec)
    clat (usec): min=52, max=32825, avg=137.85, stdev=154.82
     lat (usec): min=52, max=32825, avg=138.30, stdev=154.82
    clat percentiles (usec):
     |  1.00th=[   77],  5.00th=[   89], 10.00th=[   96], 20.00th=[  105],
     | 30.00th=[  111], 40.00th=[  115], 50.00th=[  120], 60.00th=[  126],
     | 70.00th=[  135], 80.00th=[  145], 90.00th=[  165], 95.00th=[  194],
     | 99.00th=[  578], 99.50th=[  701], 99.90th=[ 1532], 99.95th=[ 1631],
     | 99.99th=[ 2212]
   bw (  KiB/s): min=11080, max=17440, per=12.51%, avg=14387.64, stdev=1024.63, samples=473
   iops        : min= 2770, max= 4360, avg=3596.89, stdev=256.15, samples=473
  lat (usec)   : 100=15.46%, 250=81.85%, 500=0.78%, 750=1.50%, 1000=0.11%
  lat (msec)   : 2=0.29%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
  lat (msec)   : 100=0.01%
  cpu          : usr=3.47%, sys=23.50%, ctx=1782799, majf=0, minf=935
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=862631,862839,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=112MiB/s (118MB/s), 112MiB/s-112MiB/s (118MB/s-118MB/s), io=3370MiB (3533MB), run=30001-30001msec
  WRITE: bw=112MiB/s (118MB/s), 112MiB/s-112MiB/s (118MB/s-118MB/s), io=3370MiB (3534MB), run=30001-30001msec

Disk stats (read/write):
    dm-1: ios=856135/856244, merge=0/0, ticks=98174/100093, in_queue=210473, util=100.00%, aggrios=862631/862871, aggrmerge=0/18, aggrticks=98523/100457, aggrin_queue=198595, aggrutil=99.74%
  sda: ios=862631/862871, merge=0/18, ticks=98523/100457, in_queue=198595, util=99.74%
```
Throughput is about 110 MB/s in each direction; the disks are holding up quite well.
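As a quick sanity check on fio's numbers: bandwidth should equal IOPS times block size, and 28.8k IOPS × 4 KiB ≈ 112 MiB/s matches the reported figure for both the read and the write half.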
32K mixed read/write test:
```
[root@380p1 ~]# fio -filename=/root/bench/test.fio -direct=1 -rw=rw -bs=32k -size 1G -numjobs=8 -runtime=30 -group_reporting -name=file
file: (g=0): rw=rw, bs=(R) 32.0KiB-32.0KiB, (W) 32.0KiB-32.0KiB, (T) 32.0KiB-32.0KiB, ioengine=psync, iodepth=1
...
fio-3.6-31-g4386
Starting 8 processes
Jobs: 8 (f=8): [M(8)][100.0%][r=544MiB/s,w=544MiB/s][r=17.4k,w=17.4k IOPS][eta 00m:00s]
file: (groupid=0, jobs=8): err= 0: pid=31050: Sat May 12 17:33:28 2018
  read: IOPS=17.8k, BW=556MiB/s (583MB/s)(4094MiB/7359msec)
    clat (usec): min=86, max=10778, avg=215.04, stdev=118.55
     lat (usec): min=86, max=10779, avg=215.34, stdev=118.55
    clat percentiles (usec):
     |  1.00th=[  127],  5.00th=[  147], 10.00th=[  161], 20.00th=[  172],
     | 30.00th=[  180], 40.00th=[  186], 50.00th=[  192], 60.00th=[  200],
     | 70.00th=[  210], 80.00th=[  227], 90.00th=[  258], 95.00th=[  306],
     | 99.00th=[  758], 99.50th=[  816], 99.90th=[ 1205], 99.95th=[ 1549],
     | 99.99th=[ 3359]
   bw (  KiB/s): min=58688, max=78336, per=12.79%, avg=72894.37, stdev=3584.25, samples=112
   iops        : min= 1834, max= 2448, avg=2277.93, stdev=112.01, samples=112
  write: IOPS=17.8k, BW=557MiB/s (584MB/s)(4098MiB/7359msec)
    clat (usec): min=90, max=37591, avg=216.08, stdev=252.81
     lat (usec): min=92, max=37593, avg=217.75, stdev=252.81
    clat percentiles (usec):
     |  1.00th=[  133],  5.00th=[  151], 10.00th=[  161], 20.00th=[  172],
     | 30.00th=[  180], 40.00th=[  186], 50.00th=[  192], 60.00th=[  198],
     | 70.00th=[  208], 80.00th=[  225], 90.00th=[  258], 95.00th=[  306],
     | 99.00th=[  758], 99.50th=[  816], 99.90th=[ 1237], 99.95th=[ 1598],
     | 99.99th=[ 3720]
   bw (  KiB/s): min=61184, max=78656, per=12.78%, avg=72869.89, stdev=3722.37, samples=112
   iops        : min= 1912, max= 2458, avg=2277.16, stdev=116.33, samples=112
  lat (usec)   : 100=0.02%, 250=88.66%, 500=8.21%, 750=2.03%, 1000=0.93%
  lat (msec)   : 2=0.12%, 4=0.03%, 10=0.01%, 20=0.01%, 50=0.01%
  cpu          : usr=2.29%, sys=14.10%, ctx=267847, majf=0, minf=410
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=131019,131125,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=556MiB/s (583MB/s), 556MiB/s-556MiB/s (583MB/s-583MB/s), io=4094MiB (4293MB), run=7359-7359msec
  WRITE: bw=557MiB/s (584MB/s), 557MiB/s-557MiB/s (584MB/s-584MB/s), io=4098MiB (4297MB), run=7359-7359msec

Disk stats (read/write):
    dm-1: ios=129689/129818, merge=0/0, ticks=25758/25617, in_queue=54558, util=99.86%, aggrios=131019/131125, aggrmerge=0/0, aggrticks=25912/25727, aggrin_queue=51521, aggrutil=98.27%
  sda: ios=131019/131125, merge=0/0, ticks=25912/25727, in_queue=51521, util=98.27%
```
At roughly 580 MB/s, I suspect these transfers are being absorbed by the controller's cache module.
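That suspicion can be checked against the Smart Array's cache settings; a minimal sketch, again assuming hpacucli and a slot-0 controller:

```bash
# Print cache-related fields (cache board, cache status, read/write ratio)
hpacucli ctrl slot=0 show config detail | grep -i cache
```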
1M mixed read/write test:
```
[root@380p1 ~]# fio -filename=/root/bench/test.fio -direct=1 -rw=rw -bs=1m -size 1G -numjobs=8 -runtime=30 -group_reporting -name=file
file: (g=0): rw=rw, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
...
fio-3.6-31-g4386
Starting 8 processes
Jobs: 5 (f=5): [M(1),_(1),M(4),_(2)][100.0%][r=444MiB/s,w=477MiB/s][r=444,w=477 IOPS][eta 00m:00s]
file: (groupid=0, jobs=8): err= 0: pid=31076: Sat May 12 17:34:34 2018
  read: IOPS=334, BW=334MiB/s (351MB/s)(4012MiB/12000msec)
    clat (usec): min=1019, max=1587.4k, avg=14413.94, stdev=64657.06
     lat (usec): min=1019, max=1587.4k, avg=14414.43, stdev=64657.06
    clat percentiles (usec):
     |  1.00th=[   1532],  5.00th=[   2409], 10.00th=[   3195],
     | 20.00th=[   3851], 30.00th=[   4359], 40.00th=[   4686],
     | 50.00th=[   5014], 60.00th=[   5342], 70.00th=[   5735],
     | 80.00th=[   6325], 90.00th=[   7898], 95.00th=[  45351],
     | 99.00th=[ 210764], 99.50th=[ 392168], 99.90th=[1249903],
     | 99.95th=[1249903], 99.99th=[1585447]
   bw (  KiB/s): min= 2043, max=100151, per=13.03%, avg=44609.24, stdev=23766.26, samples=174
   iops        : min=    1, max=   97, avg=43.47, stdev=23.19, samples=174
  write: IOPS=348, BW=348MiB/s (365MB/s)(4180MiB/12000msec)
    clat (usec): min=1160, max=1255.9k, avg=8809.32, stdev=41022.78
     lat (usec): min=1211, max=1255.0k, avg=8865.73, stdev=41022.29
    clat percentiles (usec):
     |  1.00th=[   1434],  5.00th=[   2278], 10.00th=[   2737],
     | 20.00th=[   3326], 30.00th=[   3720], 40.00th=[   4047],
     | 50.00th=[   4359], 60.00th=[   4686], 70.00th=[   5014],
     | 80.00th=[   5407], 90.00th=[   6063], 95.00th=[   8586],
     | 99.00th=[ 112722], 99.50th=[ 179307], 99.90th=[ 517997],
     | 99.95th=[1115685], 99.99th=[1249903]
   bw (  KiB/s): min= 2043, max=108544, per=13.19%, avg=47057.34, stdev=25265.67, samples=172
   iops        : min=    1, max=  106, avg=45.86, stdev=24.66, samples=172
  lat (msec)   : 2=2.66%, 4=28.05%, 10=63.48%, 20=1.33%, 50=0.62%
  lat (msec)   : 100=1.67%, 250=1.70%, 500=0.27%, 750=0.12%
  cpu          : usr=0.32%, sys=1.06%, ctx=9070, majf=0, minf=273
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=4012,4180,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=334MiB/s (351MB/s), 334MiB/s-334MiB/s (351MB/s-351MB/s), io=4012MiB (4207MB), run=12000-12000msec
  WRITE: bw=348MiB/s (365MB/s), 348MiB/s-348MiB/s (365MB/s-365MB/s), io=4180MiB (4383MB), run=12000-12000msec

Disk stats (read/write):
    dm-1: ios=7908/8272, merge=0/0, ticks=114032/72671, in_queue=188677, util=99.26%, aggrios=8024/8360, aggrmerge=0/0, aggrticks=114610/73023, aggrin_queue=187594, aggrutil=99.04%
  sda: ios=8024/8360, merge=0/0, ticks=114610/73023, in_queue=187594, util=99.04%
```
This one is also fast, at around 350 MB/s.
0x03.2 Disk performance under RAID 10
```
[root@380G6 ~]# dd bs=1M count=1024 if=/dev/zero of=1gb-1.test conv=fdatasync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 4.7334 s, 227 MB/s
[root@380G6 ~]# dd bs=1M count=1024 if=/dev/zero of=1gb-2.test conv=fdatasync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 4.78248 s, 225 MB/s
[root@380G6 ~]# dd bs=1M count=1024 if=/dev/zero of=1gb-3.test conv=fdatasync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 4.66221 s, 230 MB/s
[root@380G6 ~]# dd bs=1M count=1024 if=/dev/zero of=1gb-4.test conv=fdatasync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 4.6648 s, 230 MB/s
[root@380G6 ~]# dd bs=1M count=1024 if=/dev/zero of=1gb-5.test conv=fdatasync
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 4.64573 s, 231 MB/s
```
At roughly 230 MB/s, the RAID 10 result actually comes in below RAID 5's ~285 MB/s, which is what you'd expect here: with four disks, RAID 10 stripes writes across only two mirrored pairs, while RAID 5 spreads them across three data spindles.
4K mixed read/write test:
```
[root@380G6 ~]# fio -filename=/root/bench/test.fio -direct=1 -rw=rw -bs=4k -size 1G -numjobs=8 -runtime=30 -group_reporting -name=file
file: (g=0): rw=rw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
...
fio-3.6-31-g4386
Starting 8 processes
file: Laying out IO file (1 file / 1024MiB)
Jobs: 8 (f=8): [M(8)][100.0%][r=101MiB/s,w=101MiB/s][r=25.9k,w=25.9k IOPS][eta 00m:00s]
file: (groupid=0, jobs=8): err= 0: pid=31269: Sat May 12 19:46:43 2018
  read: IOPS=25.0k, BW=97.7MiB/s (102MB/s)(2931MiB/30001msec)
    clat (usec): min=52, max=133121, avg=154.73, stdev=214.23
     lat (usec): min=52, max=133122, avg=155.08, stdev=214.24
    clat percentiles (usec):
     |  1.00th=[   76],  5.00th=[   91], 10.00th=[  100], 20.00th=[  113],
     | 30.00th=[  124], 40.00th=[  133], 50.00th=[  141], 60.00th=[  151],
     | 70.00th=[  163], 80.00th=[  180], 90.00th=[  210], 95.00th=[  243],
     | 99.00th=[  578], 99.50th=[  619], 99.90th=[  685], 99.95th=[  717],
     | 99.99th=[  791]
   bw (  KiB/s): min= 9568, max=15784, per=12.48%, avg=12479.38, stdev=1116.74, samples=472
   iops        : min= 2392, max= 3946, avg=3119.81, stdev=279.17, samples=472
  write: IOPS=25.0k, BW=97.7MiB/s (102MB/s)(2931MiB/30001msec)
    clat (usec): min=52, max=90265, avg=158.94, stdev=153.77
     lat (usec): min=52, max=90265, avg=159.43, stdev=153.78
    clat percentiles (usec):
     |  1.00th=[   79],  5.00th=[   95], 10.00th=[  103], 20.00th=[  117],
     | 30.00th=[  128], 40.00th=[  137], 50.00th=[  145], 60.00th=[  155],
     | 70.00th=[  167], 80.00th=[  184], 90.00th=[  215], 95.00th=[  249],
     | 99.00th=[  586], 99.50th=[  627], 99.90th=[  693], 99.95th=[  717],
     | 99.99th=[  775]
   bw (  KiB/s): min= 9424, max=16128, per=12.48%, avg=12478.56, stdev=1104.97, samples=472
   iops        : min= 2356, max= 4032, avg=3119.61, stdev=276.24, samples=472
  lat (usec)   : 100=8.84%, 250=86.54%, 500=3.05%, 750=1.55%, 1000=0.02%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
  lat (msec)   : 100=0.01%, 250=0.01%
  cpu          : usr=3.13%, sys=25.26%, ctx=1566610, majf=0, minf=895
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=750249,750239,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=97.7MiB/s (102MB/s), 97.7MiB/s-97.7MiB/s (102MB/s-102MB/s), io=2931MiB (3073MB), run=30001-30001msec
  WRITE: bw=97.7MiB/s (102MB/s), 97.7MiB/s-97.7MiB/s (102MB/s-102MB/s), io=2931MiB (3073MB), run=30001-30001msec

Disk stats (read/write):
    dm-0: ios=745867/745800, merge=0/0, ticks=99928/101713, in_queue=222033, util=100.00%, aggrios=750249/750287, aggrmerge=0/19, aggrticks=100079/101811, aggrin_queue=201957, aggrutil=99.77%
  sda: ios=750249/750287, merge=0/19, ticks=100079/101811, in_queue=201957, util=99.77%
```
RAID 10's 4K result is around 100 MB/s, slightly below RAID 5's 110 MB/s.
32K mixed read/write test:
```
[root@380G6 ~]# fio -filename=/root/bench/test.fio -direct=1 -rw=rw -bs=32k -size 1G -numjobs=8 -runtime=30 -group_reporting -name=file
file: (g=0): rw=rw, bs=(R) 32.0KiB-32.0KiB, (W) 32.0KiB-32.0KiB, (T) 32.0KiB-32.0KiB, ioengine=psync, iodepth=1
...
fio-3.6-31-g4386
Starting 8 processes
Jobs: 8 (f=8): [M(8)][100.0%][r=559MiB/s,w=557MiB/s][r=17.9k,w=17.8k IOPS][eta 00m:00s]
file: (groupid=0, jobs=8): err= 0: pid=31295: Sat May 12 19:48:03 2018
  read: IOPS=17.7k, BW=553MiB/s (580MB/s)(4094MiB/7404msec)
    clat (usec): min=82, max=100901, avg=221.53, stdev=741.09
     lat (usec): min=82, max=100902, avg=221.87, stdev=741.09
    clat percentiles (usec):
     |  1.00th=[  119],  5.00th=[  145], 10.00th=[  161], 20.00th=[  176],
     | 30.00th=[  184], 40.00th=[  192], 50.00th=[  198], 60.00th=[  206],
     | 70.00th=[  217], 80.00th=[  231], 90.00th=[  262], 95.00th=[  310],
     | 99.00th=[  676], 99.50th=[  725], 99.90th=[  816], 99.95th=[  865],
     | 99.99th=[38536]
   bw (  KiB/s): min=47744, max=81920, per=12.67%, avg=71730.44, stdev=7011.10, samples=112
   iops        : min= 1492, max= 2560, avg=2241.56, stdev=219.10, samples=112
  write: IOPS=17.7k, BW=553MiB/s (580MB/s)(4098MiB/7404msec)
    clat (usec): min=87, max=104739, avg=216.55, stdev=563.24
     lat (usec): min=88, max=104740, avg=218.33, stdev=563.24
    clat percentiles (usec):
     |  1.00th=[  123],  5.00th=[  149], 10.00th=[  161], 20.00th=[  174],
     | 30.00th=[  184], 40.00th=[  190], 50.00th=[  198], 60.00th=[  204],
     | 70.00th=[  215], 80.00th=[  227], 90.00th=[  260], 95.00th=[  310],
     | 99.00th=[  676], 99.50th=[  717], 99.90th=[  807], 99.95th=[  840],
     | 99.99th=[  963]
   bw (  KiB/s): min=48448, max=82688, per=12.66%, avg=71724.04, stdev=7205.60, samples=112
   iops        : min= 1514, max= 2584, avg=2241.37, stdev=225.18, samples=112
  lat (usec)   : 100=0.27%, 250=87.72%, 500=9.76%, 750=1.96%, 1000=0.28%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
  lat (msec)   : 100=0.01%, 250=0.01%
  cpu          : usr=2.50%, sys=15.26%, ctx=268790, majf=0, minf=406
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=131019,131125,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=553MiB/s (580MB/s), 553MiB/s-553MiB/s (580MB/s-580MB/s), io=4094MiB (4293MB), run=7404-7404msec
  WRITE: bw=553MiB/s (580MB/s), 553MiB/s-553MiB/s (580MB/s-580MB/s), io=4098MiB (4297MB), run=7404-7404msec

Disk stats (read/write):
    dm-0: ios=130961/131067, merge=0/0, ticks=26466/25721, in_queue=56508, util=100.00%, aggrios=131019/131125, aggrmerge=0/0, aggrticks=26409/25691, aggrin_queue=52085, aggrutil=98.46%
  sda: ios=131019/131125, merge=0/0, ticks=26409/25691, in_queue=52085, util=98.46%
```
RAID 10's 32K throughput is about the same as RAID 5's.
1M mixed read/write test:
```
[root@380G6 ~]# fio -filename=/root/bench/test.fio -direct=1 -rw=rw -bs=1m -size 1G -numjobs=8 -runtime=30 -group_reporting -name=file
file: (g=0): rw=rw, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
...
fio-3.6-31-g4386
Starting 8 processes
Jobs: 5 (f=5): [M(2),_(1),M(2),_(1),M(1),_(1)][92.3%][r=435MiB/s,w=469MiB/s][r=435,w=469 IOPS][eta 00m:01s]
file: (groupid=0, jobs=8): err= 0: pid=31322: Sat May 12 19:49:10 2018
  read: IOPS=330, BW=330MiB/s (346MB/s)(4012MiB/12156msec)
    clat (usec): min=854, max=373553, avg=13982.25, stdev=34835.55
     lat (usec): min=855, max=373554, avg=13982.74, stdev=34835.54
    clat percentiles (msec):
     |  1.00th=[    3],  5.00th=[    4], 10.00th=[    5], 20.00th=[    5],
     | 30.00th=[    6], 40.00th=[    6], 50.00th=[    6], 60.00th=[    7],
     | 70.00th=[    7], 80.00th=[    7], 90.00th=[    8], 95.00th=[   65],
     | 99.00th=[  203], 99.50th=[  230], 99.90th=[  313], 99.95th=[  355],
     | 99.99th=[  376]
   bw (  KiB/s): min=10240, max=83968, per=12.29%, avg=41531.97, stdev=16267.37, samples=186
   iops        : min=   10, max=   82, avg=40.46, stdev=15.87, samples=186
  write: IOPS=343, BW=344MiB/s (361MB/s)(4180MiB/12156msec)
    clat (usec): min=894, max=356092, avg=9500.70, stdev=24636.85
     lat (usec): min=918, max=356152, avg=9556.35, stdev=24637.04
    clat percentiles (msec):
     |  1.00th=[    3],  5.00th=[    4], 10.00th=[    5], 20.00th=[    5],
     | 30.00th=[    6], 40.00th=[    6], 50.00th=[    6], 60.00th=[    7],
     | 70.00th=[    7], 80.00th=[    7], 90.00th=[    8], 95.00th=[    8],
     | 99.00th=[  153], 99.50th=[  197], 99.90th=[  305], 99.95th=[  355],
     | 99.99th=[  355]
   bw (  KiB/s): min= 6144, max=107536, per=12.31%, avg=43361.76, stdev=18967.14, samples=186
   iops        : min=    6, max=  105, avg=42.25, stdev=18.50, samples=186
  lat (usec)   : 1000=0.09%
  lat (msec)   : 2=0.48%, 4=5.11%, 10=88.23%, 20=0.45%, 50=1.11%
  lat (msec)   : 100=2.10%, 250=2.21%, 500=0.22%
  cpu          : usr=0.34%, sys=0.97%, ctx=8968, majf=0, minf=270
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=4012,4180,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=330MiB/s (346MB/s), 330MiB/s-330MiB/s (346MB/s-346MB/s), io=4012MiB (4207MB), run=12156-12156msec
  WRITE: bw=344MiB/s (361MB/s), 344MiB/s-344MiB/s (361MB/s-361MB/s), io=4180MiB (4383MB), run=12156-12156msec

Disk stats (read/write):
    dm-0: ios=7998/8337, merge=0/0, ticks=111399/78702, in_queue=190413, util=99.36%, aggrios=8024/8361, aggrmerge=0/0, aggrticks=111418/78705, aggrin_queue=190096, aggrutil=99.05%
  sda: ios=8024/8361, merge=0/0, ticks=111418/78705, in_queue=190096, util=99.05%
```
RAID 10's 1M throughput is likewise about the same as RAID 5's.
0x03.3 Overall performance with Power Capping disabled
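The overall scores below are UnixBench output. As a minimal sketch of a typical invocation (the upstream repository URL is my assumption, not something stated in the original post):

```bash
# Fetch and run UnixBench; ./Run builds the tests, does a single-copy
# pass, then one copy per CPU (16 copies on this box)
git clone https://github.com/kdlucas/byte-unixbench.git
cd byte-unixbench/UnixBench
./Run
```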
```
------------------------------------------------------------------------
Benchmark Run: Sat May 12 2018 21:30:31 - 21:58:36
16 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       23806589.8 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3143.5 MWIPS (9.3 s, 7 samples)
Execl Throughput                               1592.1 lps   (29.6 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        477843.9 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          126115.2 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1257179.0 KBps  (30.0 s, 2 samples)
Pipe Throughput                              589279.2 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 108094.4 lps   (10.0 s, 7 samples)
Process Creation                               3551.7 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   3369.7 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   2086.9 lpm   (60.0 s, 2 samples)
System Call Overhead                         557238.8 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   23806589.8   2040.0
Double-Precision Whetstone                       55.0       3143.5    571.5
Execl Throughput                                 43.0       1592.1    370.3
File Copy 1024 bufsize 2000 maxblocks          3960.0     477843.9   1206.7
File Copy 256 bufsize 500 maxblocks            1655.0     126115.2    762.0
File Copy 4096 bufsize 8000 maxblocks          5800.0    1257179.0   2167.5
Pipe Throughput                               12440.0     589279.2    473.7
Pipe-based Context Switching                   4000.0     108094.4    270.2
Process Creation                                126.0       3551.7    281.9
Shell Scripts (1 concurrent)                     42.4       3369.7    794.7
Shell Scripts (8 concurrent)                      6.0       2086.9   3478.1
System Call Overhead                          15000.0     557238.8    371.5
                                                                   ========
System Benchmarks Index Score                                         750.4

------------------------------------------------------------------------
Benchmark Run: Sat May 12 2018 21:58:36 - 22:26:25
16 CPUs in system; running 16 parallel copies of tests

Dhrystone 2 using register variables      190796296.4 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    38221.2 MWIPS (9.7 s, 7 samples)
Execl Throughput                              22683.0 lps   (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        662626.5 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          187366.5 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1944384.0 KBps  (30.0 s, 2 samples)
Pipe Throughput                             6164645.5 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                1386729.5 lps   (10.0 s, 7 samples)
Process Creation                              59669.5 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                  36711.1 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   4942.1 lpm   (60.1 s, 2 samples)
System Call Overhead                        4534713.2 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0  190796296.4  16349.3
Double-Precision Whetstone                       55.0      38221.2   6949.3
Execl Throughput                                 43.0      22683.0   5275.1
File Copy 1024 bufsize 2000 maxblocks          3960.0     662626.5   1673.3
File Copy 256 bufsize 500 maxblocks            1655.0     187366.5   1132.1
File Copy 4096 bufsize 8000 maxblocks          5800.0    1944384.0   3352.4
Pipe Throughput                               12440.0    6164645.5   4955.5
Pipe-based Context Switching                   4000.0    1386729.5   3466.8
Process Creation                                126.0      59669.5   4735.7
Shell Scripts (1 concurrent)                     42.4      36711.1   8658.3
Shell Scripts (8 concurrent)                      6.0       4942.1   8236.9
System Call Overhead                          15000.0    4534713.2   3023.1
                                                                   ========
System Benchmarks Index Score                                        4487.9
```
0x03.4 Overall performance with Power Capping at 50%
The screenshot above shows the setting; the results follow:
```
------------------------------------------------------------------------
Benchmark Run: Sat May 12 2018 22:29:30 - 22:57:35
16 CPUs in system; running 1 parallel copy of tests

Dhrystone 2 using register variables       23745288.4 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                     3143.5 MWIPS (9.3 s, 7 samples)
Execl Throughput                               1628.4 lps   (29.7 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        461504.0 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          124107.0 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1227233.2 KBps  (30.0 s, 2 samples)
Pipe Throughput                              585536.5 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 108645.4 lps   (10.0 s, 7 samples)
Process Creation                               3529.8 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                   3356.0 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   2073.6 lpm   (60.0 s, 2 samples)
System Call Overhead                         526798.7 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0   23745288.4   2034.7
Double-Precision Whetstone                       55.0       3143.5    571.5
Execl Throughput                                 43.0       1628.4    378.7
File Copy 1024 bufsize 2000 maxblocks          3960.0     461504.0   1165.4
File Copy 256 bufsize 500 maxblocks            1655.0     124107.0    749.9
File Copy 4096 bufsize 8000 maxblocks          5800.0    1227233.2   2115.9
Pipe Throughput                               12440.0     585536.5    470.7
Pipe-based Context Switching                   4000.0     108645.4    271.6
Process Creation                                126.0       3529.8    280.1
Shell Scripts (1 concurrent)                     42.4       3356.0    791.5
Shell Scripts (8 concurrent)                      6.0       2073.6   3456.0
System Call Overhead                          15000.0     526798.7    351.2
                                                                   ========
System Benchmarks Index Score                                         742.4

------------------------------------------------------------------------
Benchmark Run: Sat May 12 2018 22:57:35 - 23:25:53
16 CPUs in system; running 16 parallel copies of tests

Dhrystone 2 using register variables      116011916.0 lps   (10.0 s, 7 samples)
Double-Precision Whetstone                    26322.8 MWIPS (11.2 s, 7 samples)
Execl Throughput                              13840.6 lps   (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks        413571.7 KBps  (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks          113527.7 KBps  (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks       1172075.9 KBps  (30.0 s, 2 samples)
Pipe Throughput                             4056963.1 lps   (10.0 s, 7 samples)
Pipe-based Context Switching                 948133.5 lps   (10.0 s, 7 samples)
Process Creation                              31880.5 lps   (30.0 s, 2 samples)
Shell Scripts (1 concurrent)                  19067.8 lpm   (60.0 s, 2 samples)
Shell Scripts (8 concurrent)                   2410.4 lpm   (60.2 s, 2 samples)
System Call Overhead                        3354733.9 lps   (10.0 s, 7 samples)

System Benchmarks Index Values               BASELINE       RESULT    INDEX
Dhrystone 2 using register variables         116700.0  116011916.0   9941.0
Double-Precision Whetstone                       55.0      26322.8   4786.0
Execl Throughput                                 43.0      13840.6   3218.7
File Copy 1024 bufsize 2000 maxblocks          3960.0     413571.7   1044.4
File Copy 256 bufsize 500 maxblocks            1655.0     113527.7    686.0
File Copy 4096 bufsize 8000 maxblocks          5800.0    1172075.9   2020.8
Pipe Throughput                               12440.0    4056963.1   3261.2
Pipe-based Context Switching                   4000.0     948133.5   2370.3
Process Creation                                126.0      31880.5   2530.2
Shell Scripts (1 concurrent)                     42.4      19067.8   4497.1
Shell Scripts (8 concurrent)                      6.0       2410.4   4017.4
System Call Overhead                          15000.0    3354733.9   2236.5
                                                                   ========
System Benchmarks Index Score                                        2735.0
```
The single-copy score is essentially unchanged, but the 16-copy score falls by nearly 40%, from 4487.9 to 2735.0.
As the charts below show, Power Capping really does keep the power draw in check, though performance drops along with it; in day-to-day use the cap can be tuned to suit the workload.
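If you'd rather script the cap than click through the iLO page, ipmitool's DCMI extension has power-limit commands. A sketch under the assumption that the BMC implements DCMI power management (the iLO 2 in this generation may only expose capping through its own interface):

```bash
# Set and activate a 200 W power limit via DCMI (the value is illustrative);
# this works only if the BMC implements the DCMI power-management commands
ipmitool dcmi power set_limit limit 200
ipmitool dcmi power activate
ipmitool dcmi power get_limit   # confirm the active limit
```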
The monitoring yielded the following power figures:
- Idle (zero load): around 150 W
- 100% load: around 270 W
0x04 Closing words
Since neither the drive bays nor the memory slots are fully populated, and the CPUs are low-voltage parts, these power figures won't be representative of other configurations.