
TL2000 Dell PowerVault Tape Library

TL2000_PowerVault_User_Guide_UG_en (PDF)

Conclusions from Performance Tests

This started as testing the TL2000 tape device and ended up as an exercise in optimising the read performance of the Dell R510 server.

Recommended configuration for the Dell R510 to optimise read performance of home directories:

"Slow" Disks

This was not expected. The disks originally in maris were slower than those in both desiree and cabbage; the "slow" disks are now in desiree. They were roughly 20% slower when measured using RAID 10 and the tar read of home directories performance test.

RAID 10 over 10 disks, with 2 Hot Spares

RAID card:
Stripe size of 1024k (1M) = maximum
Default settings for read and write. I think these are Adaptive Read Ahead
and Write Back.

LVM:
Standard LVM setup as documented; make sure the partition alignment is right.
# install gdisk (from the DAG repo) and create a single GPT partition:
yum --enablerepo=dag install gdisk
gdisk /dev/sdb
# PV, then a VG named after the host (e.g. MarisSpan0) with a single LV spanning it:
pvcreate -v /dev/sdb1
VGNAME=$(perl -e 'use POSIX; $v=(uname())[1];$v=~ s/\..*//;print ucfirst($v)."Span0\n"')
vgcreate -v $VGNAME --physicalextentsize 32M /dev/sdb1
lvcreate -l 100%VG --name data -v $VGNAME
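To double-check the alignment, a sketch (using the /dev/sdb device and 1M stripe size from above):
# the partition start in sectors should be a multiple of the stripe size (1M = 2048 x 512-byte sectors)
parted /dev/sdb unit s print
# and the PV data area offset that LVM will use:
pvs -o +pe_start /dev/sdb1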

File-system:
IDEV=/dev/$VGNAME/data
# su = RAID stripe size, sw = number of data-bearing stripes (10-disk RAID 10 -> 5)
mkfs.xfs -d su=1024k,sw=5 -i attr=2 -l internal,version=2 $IDEV

fstab mount options:
defaults,nosuid,nodev,quota,noatime,logbufs=8,logbsize=256k
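e.g. as an /etc/fstab line (mount point and device are examples; on maris the VG created above would be MarisSpan0):
/dev/MarisSpan0/data  /local/sync  xfs  defaults,nosuid,nodev,quota,noatime,logbufs=8,logbsize=256k  0 0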

Note:
slight performance improvement without LVM, maybe 5%

RAID 6 over 11 disks with 1 hot spare

RAID card:
Stripe size of 64k = minimum
Default settings for read and write. I think these are Adaptive Read Ahead
and Write Back.

LVM:
Standard LVM setup as above; make sure the partition alignment is right.
yum --enablerepo=dag install gdisk
gdisk /dev/sdb
pvcreate -v /dev/sdb1
VGNAME=$(perl -e 'use POSIX; $v=(uname())[1];$v=~ s/\..*//;print ucfirst($v)."Span0\n"')
vgcreate -v $VGNAME --physicalextentsize 32M /dev/sdb1
lvcreate -l 100%VG --name data -v $VGNAME

File-system:
IDEV=/dev/$VGNAME/data
# su = RAID stripe size, sw = data-bearing disks (11-disk RAID 6 minus 2 parity -> 9)
mkfs.xfs -d su=64k,sw=9 -i attr=2 -l internal,version=2 $IDEV

fstab mount options:
defaults,nosuid,nodev,quota,noatime,logbufs=8,logbsize=256k

Working Notes

Power on the TL2000

The device will be created on the attached server:
/dev/changer
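If /dev/changer does not appear, the changer's generic SCSI node (like the /dev/sg3 used below) can be located with lsscsi, e.g.:
lsscsi -g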

Check status
mtx -f /dev/changer status

mtx -f /dev/changer status
  Storage Changer /dev/changer:1 Drives, 23 Slots ( 0 Import/Export )
Data Transfer Element 0:Empty
      Storage Element 1:Empty
      Storage Element 2:Empty
      Storage Element 3:Empty
      Storage Element 4:Empty
      Storage Element 5:Empty
      Storage Element 6:Empty
- snip -

Notes:
Data Transfer Element 0 is the tape drive
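# The drive itself is /dev/st0 (rewinding) / /dev/nst0 (non-rewinding); a quick
# status check of the drive, as opposed to the changer (a sketch):
mt -f /dev/st0 status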

mtx -f /dev/changer inquiry
Product Type: Medium Changer
Vendor ID: 'IBM     '
Product ID: '3573-TL         '
Revision: '9.50'
Attached Changer: No

# Load media from slot 1 into drive:
$ mtx -f /dev/sg3 load 1
# status query includes:
Data Transfer Element 0:Full (Storage Element 1 Loaded)

# unload drive to slot 1 (default as not specified):
mtx -f /dev/changer unload
# load from slot 24:
mtx -f /dev/changer load 24
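# unload can also take an explicit slot and drive (mtx syntax: unload [slot] [drive]),
# e.g. to return the tape to slot 24 from drive 0:
mtx -f /dev/changer unload 24 0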

# test transfer of data to tape:
tar cpvf /dev/st0 ./howto/

# Errors you get if you try to write to /dev/changer or /dev/sg3
tar -cpvf /dev/sg3 ./howto/
tar: /dev/sg3: Cannot write: Cannot allocate memory

# table of contents (no z flag: the test archive above was written uncompressed):
tar tvf /dev/st0

# Test backup of two copies of the DAMTP home directories:
cd /local/sync
# trigger a scan of the file system:
find . -iname "*rtljkhfgnalgnalegn*"
# then the actual tar of 1.3TB of files:
time tar cpf /dev/st0 ./cinghome ./cinghome2
real    1694m23.938s
user    6m53.381s
sys     55m43.215s
28.24 hours for 1.26TB of home directory data
(1260*1000)/(1694*60) = 12.4 MB/sec

# this time with compression:
time tar cpzf /dev/st0 ./cinghome ./cinghome2
real    2480m50.416s
user    1505m25.931s
sys     78m24.657s
(1260*1000)/(2480*60) = 8.5 MB/sec

# create a 128GB tar archive on disk:
maris:/local/sync/cinghome $ tar cpf /local/sync/test_data.tar ./gr ./chaos -b 300

# time the same tar to tape:
maris:/local/sync/cinghome $ time tar cpf /dev/st0 ./gr ./chaos -b 300
real    108m30.785s
user    0m14.866s
sys     3m5.396s
(128*1000)/(108*60) = 19.75 MB/sec

# time the same tar to /dev/null via dd
maris:/local/sync/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    90m31.963s
user    0m16.024s
sys     3m38.151s
(128*1000)/(90.5*60) = 23.57 MB/sec
# repeat:
real    89m46.543s
user    0m16.793s
sys     3m42.144s
(128*1000)/(89.75*60) = 23.77 MB/sec

# reboot and repeat


# same test on cabbage
cabbage:/local/backup/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    84m10.258s
user    0m15.674s
sys     3m46.482s
(128*1000)/(84.22*60) = 25.33 MB/sec

# reboot and repeat
cabbage:/local/backup/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    85m19.239s
user    0m15.635s
sys     3m36.686s
(128*1000)/(85.33*60) = 25 MB/sec
# reboot and mount with nodiratime option
cabbage:/local/backup/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    85m27.705s
user    0m15.716s
sys     3m37.414s
= same

# try with RAID6
# desiree, RAID6 over 12 2TB HDDs
desiree:/local/backup/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    44m18.994s
user    0m14.170s
sys     3m31.057s
(128*1000)/(44.33*60) = 48.12 MB/sec

# reboot and repeat
desiree:/local/backup/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    47m34.463s
user    0m15.097s
sys     3m29.015s
(128*1000)/(47.5*60) = 44.91 MB/sec

# Now with other mount options (logbufs=8,logbsize=256k)
desiree:/local/backup/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    47m5.655s
user    0m14.659s
sys     3m33.329s
= same

# maris RAID10 with logbufs=8,logbsize=256k
maris:/local/sync/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    91m29.291s
user    0m16.067s
sys     3m41.533s
(128*1000)/(91.5*60) = 23.3 MB/sec

# RAID 10 on maris with nodiratime mount option for xfs
maris:/local/sync/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
# oops, reading more suggests that nodiratime is set when noatime is set

# xfs parameter changes
http://www.practicalsysadmin.com/wiki/index.php/XFS_optimisation
http://www.mythtv.org/wiki/XFS_Filesystem#Mounting_the_XFS_filesystem_with_high_performance_options
http://everything2.com/title/Filesystem+performance+tweaking+with+XFS+on+Linux
http://xfs.org/index.php/XFS_FAQ

# Perc performance tuning?
Suggests stripe size increase to 512k
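# the current stripe ("strip") size on the PERC can be checked with MegaCli, a sketch
# (same install path as used later in these notes; changing it means re-creating the VD):
/usr/local/sbin/MegaCli -LDInfo -LALL -aALL | grep -i strip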

# maris, remove noatime and nodiratime
maris:/local/sync/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    91m25.559s
user    0m16.183s
sys     3m40.502s
= same

# cabbage, lazy-count=1 during mkfs.xfs
cabbage:/local/backup/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    87m53.122s
user    0m16.089s
sys     3m44.112s
= same

# desiree, RAID6 over 11 disks (+ 1 HS)
desiree:/local/backup/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    46m23.132s
user    0m14.535s
sys     3m33.222s
(128*1000)/(46.4*60) = 45.97MB/sec

# simple write test
desiree:/local/backup $ time dd if=/dev/zero of=./2TB bs=1024k count=2048000
2048000+0 records in
2048000+0 records out
2147483648000 bytes (2.1 TB) copied, 4737.26 seconds, 453 MB/s
real    78m58.576s
user    0m2.875s
sys     44m37.443s

# same write test on maris with RAID10 over 10 disks (+ 2 HS)
2048000+0 records in
2048000+0 records out
2147483648000 bytes (2.1 TB) copied, 4044.37 seconds, 531 MB/s
real    67m25.275s
user    0m3.183s
sys     51m31.996s

# desiree
RAID6 (11 disks + 1 HS), stripe size 512k
desiree:/local/backup/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    58m54.944s
user    0m15.399s
sys     3m36.378s
(128*1000)/(59*60) = 36.15MB/sec
# simple write test
desiree:/local/backup $ time dd if=/dev/zero of=./2TB bs=1024k count=2048000
2147483648000 bytes (2.1 TB) copied, 2743.88 seconds, 783 MB/s

# cabbage
RAID10 (10 disks + 2 HS), stripe size 512k
cabbage:/local/backup/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    34m33.936s
user    0m14.291s
sys     3m35.915s
(128*1000)/(34.5*60) = 61.83MB/sec
# repeat to check: 30 mins (even faster)
# simple write test
cabbage:/local/backup/cinghome $ time dd if=/dev/zero of=./2TB bs=1024k count=2048000
2147483648000 bytes (2.1 TB) copied, 4309.78 seconds, 498 MB/s
# reboot and run read test again, to be sure.
real    36m0.916s
= same

# try without LVM, mkfs.xfs directly onto partition
# cabbage, RAID 10 + 2 HS, 512k stripe width
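# (the mkfs for the no-LVM runs wasn't recorded; a sketch, assuming the same /dev/sdb1
#  partition and the 512k geometry used here)
mkfs.xfs -d su=512k,sw=5 -i attr=2 -l internal,version=2 /dev/sdb1
mount -o noatime,logbufs=8,logbsize=256k /dev/sdb1 /local/backup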
cabbage:/local/backup/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    32m1.775s
(128*1000)/(32*60) = 66.66MB/sec
# improved?
# reboot and re-run the test
real    33m27.284s
= conclude slight improvement without LVM, maybe 5%

# desiree, RAID6 + 1 HS, 512k stripe width
desiree:/local/backup/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    56m29.759s
(128*1000)/(56.5*60) = 37.76 MB/sec
# improved?
# reboot and re-run the test
real    57m1.217s
# slight improvement

# try with software RAID

# Other stripe sizes with each RAID level: 1MB, 256k

# cabbage, 1MB stripe width, RAID10 over 10 disks
cabbage:/local/backup/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    27m2.647s
# reboot and try again.
real    27m39.247s
(128*1000)/(27*60) = 79 MB/sec

# write test
time dd if=/dev/zero of=./2TB bs=1024k count=2048000
2147483648000 bytes (2.1 TB) copied, 3991.52 seconds, 538 MB/s

# desiree, 1MB stripe width, RAID6 over 11 disks
desiree:/local/backup/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    59m40.819s
# reboot and retry
real    61m13.788s
# write test
time dd if=/dev/zero of=./2TB bs=1024k count=2048000
2147483648000 bytes (2.1 TB) copied, 2837.29 seconds, 757 MB/s

# cabbage, 256k stripe width, RAID10 over 10 disks
IDEV=/dev/sdb1
mkfs.xfs -d su=256k,sw=5 -i attr=2 -l internal,version=2 $IDEV
cabbage:/local/backup/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    43m53.178s
time dd if=/dev/zero of=./2TB bs=1024k count=2048000
2147483648000 bytes (2.1 TB) copied, 4945.34 seconds, 434 MB/s

# desiree, 256k stripe width, RAID6 over 11 disks
IDEV=/dev/sdb1
mkfs.xfs -d su=256k,sw=9 -i attr=2 -l internal,version=2 $IDEV
desiree:/local/backup/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    46m27.165s
time dd if=/dev/zero of=./2TB bs=1024k count=2048000
2147483648000 bytes (2.1 TB) copied, 2960.77 seconds, 725 MB/s

# cabbage, software RAID10 over 10 disks
/sbin/mdadm --create --metadata=1.2 --verbose /dev/md0 --level=10 --raid-devices=10 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdi1 /dev/sdj1 /dev/sdk1
mkfs.xfs -d su=64k,sw=5 -i attr=2 -l internal,version=2 /dev/md0
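# before timing, check the initial resync has finished (an array still resyncing reads slower):
cat /proc/mdstat
/sbin/mdadm --detail /dev/md0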
cabbage:/local/backup/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    88m34.933s

# desiree, software RAID6 over 11 disks
/sbin/mdadm --create --metadata=1.2 --verbose /dev/md0 --level=6 --raid-devices=11 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdi1 /dev/sdj1 /dev/sdk1 /dev/sdl1
mkfs.xfs -d su=64k,sw=9 -i attr=2 -l internal,version=2 /dev/md0
desiree:/local/backup/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    88m35.789s

# amazingly similar times for software RAID6 and RAID10

rsync -axSHR ./cinghome/ -e "ssh -2 -x -c arcfour" cabbage:/local/backup

# configure maris with the "winning" RAID level and other config
mkfs.xfs -d su=1024k,sw=5 -i attr=2 -l internal,version=2 $IDEV
# oh no, just 60MB/sec yet managed 80MB/sec before - must be missing something...
1156684185600 bytes (1.2 TB) copied, 2221.45 seconds, 521 MB/s

# configure desiree and cabbage with RAID10 over 10 disks with 1MB stripe size
# desiree
real    30m0.640s
(128*1000)/(30*60) = 71MB/sec
# cabbage
real    29m52.322s
(128*1000)/(30*60) = 71MB/sec

# run the test again on maris
slower... around 60MB/sec

# swapped disks between maris and desiree.
RAID cards give warnings; press "f" to import the configuration from the disks.
Then a simple /etc/fstab edit to pick up the different LVM VG name.
Wait for the consistency check to finish before the read test.

maris:/local/sync/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    30m16.459s

desiree:/local/backup/cinghome $ time tar cpf - ./gr ./chaos -b 300 | dd of=/dev/null bs=100MB
real    38m33.996s

# looks like something is wrong with the set of disks that was in maris and is now
# in desiree. Conclusion: use maris (now) for tape backup.

# Write tests (of small files)
# in /local/backup/cinghome/ or /local/sync/cinghome/
time rsync -axSHR gr ../
# desiree
real    62m48.320s
# maris
real    63m18.660s

# read/write test with RAID10, 10 disks, 64k and 512k stripe size

# cabbage, 64k stripe size
mkfs.xfs -d su=64k,sw=5 -i attr=2 -l internal,version=2 $IDEV
real    95m53.450s
real    97m28.721s

# maris, 512k stripe size
mkfs.xfs -d su=512k,sw=5 -i attr=2 -l internal,version=2 $IDEV
real    64m35.318s
real    64m50.431s

# desiree, 1024k stripe size, with "slow" set of disks swapped from maris
real    62m12.340s
real    62m49.950s

conclude that 1MB stripe size for RAID10 is optimal for read and write of
small files

# checking RAID initialisation status
 /usr/local/sbin/MegaCli -LDInfo -LALL -aALL
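# background initialisation progress can also be queried directly, a sketch
# (assuming the same MegaCli install path):
/usr/local/sbin/MegaCli -LDBI -ShowProg -LALL -aALL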

# tar test of 1.2TB to tape, maris
cd /local/sync
time tar cpf /dev/st0 ./cinghome ./cinghome2
real    746m3.958s
12.43 hours for 1.2TB (previously 28.2 hours)
(1260*1000)/(746*60) = 28.15 MB/sec

# maybe try changing the lvm "PE Size" (vgdisplay reports this as 32MB)
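# the current PE size can be confirmed with (a sketch; VG name as created above):
vgdisplay $VGNAME | grep "PE Size"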

# small files write test using rsync from same file-system to itself.
cabbage:/local/backup/cinghome $ time rsync -axSHR waves ../
6.3G    waves
cabbage:/local/backup/cinghome $ time rsync -axSHR waves ../
real    3m5.448s
(6.3*1000)/(3*60) = 35MB/sec
cabbage:/local/backup/cinghome $ time rsync -axSHR gr ../
real    61m9.404s
(105*1000)/(61*60) = 28.7MB/sec

# similar test on cingulum
cingulum:/local/cinghome $ time rsync -axSHR waves ./temp/
real    33m43.245s
user    1m17.246s
sys     2m23.737s
(6.3*1000)/(33*60) = 3.18 MB/sec 
not really fair as cingulum is hosting live home directories

# maris setup for DAMTP home directories (in the first instance)
# RAID10 stripe size of 1024k
maris:/local/marishome/cinghome $ time rsync -axSHR gr ../
real    58m41.427s
user    18m37.255s
sys     15m8.833s


# try with xen