The errors seen in the dsmerror.log were:
07/22/2014 13:27:59 ANS9365E VMware vStorage API error for virtual machine '<vm hostname>'.
TSM function name : visdkCreateVmSnapshotMoRef
TSM file : vmvisdk.cpp (5416)
API return code : 67
API error message : Another task is already in progress.
07/22/2014 13:27:59 ANS5250E An unexpected error was encountered.
TSM function name : vmVddkFullVMPrePareToOpenVMDKs
TSM function : snapshot targetMoRefP is null
TSM return code : 115
TSM file : ..\..\common\vm\vmbackvddk.cpp (12439)
07/22/2014 13:27:59 ANS1228E Sending of object '<vm hostname>' failed.
07/22/2014 13:27:59 ANS4015E Error processing '<vm hostname>': unexpected TSM error (115)
The problem was at the VM level. We found a hung VMware Tools install that was preventing the snapshot from executing.
Resolution: On the vCenter console we manually stopped the VMware Tools install and were then able to manually execute a TSM4VE back-up.
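When a dsmerror.log gets long, it can help to pull out just the ANS codes and the "key : value" detail lines that follow them. Below is a minimal Python sketch of that idea, assuming the log layout matches the excerpt above (a timestamp, then the ANS code, then detail lines without a timestamp). The function name summarize_dsmerror and the sample text are my own and are not part of any TSM tooling.

```python
import re

# Matches lines such as:
# 07/22/2014 13:27:59 ANS9365E VMware vStorage API error ...
ANS_LINE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{4}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<code>ANS\d{4}[EWI]) (?P<text>.*)$"
)


def summarize_dsmerror(lines):
    """Return one dict per ANS message found in the given log lines.

    Lines without a timestamp (for example 'API return code : 67') are
    attached to the most recent ANS message as key/value details.
    """
    entries = []
    for raw in lines:
        line = raw.strip()
        match = ANS_LINE.match(line)
        if match:
            entries.append({
                "timestamp": f"{match['date']} {match['time']}",
                "code": match["code"],
                "text": match["text"],
                "details": {},
            })
        elif entries and " : " in line:
            key, value = line.split(" : ", 1)
            entries[-1]["details"][key.strip()] = value.strip()
    return entries


# Sample taken from the log excerpt above (a few detail lines omitted).
SAMPLE_LOG = """\
07/22/2014 13:27:59 ANS9365E VMware vStorage API error for virtual machine '<vm hostname>'.
TSM function name : visdkCreateVmSnapshotMoRef
API return code : 67
API error message : Another task is already in progress.
07/22/2014 13:27:59 ANS5250E An unexpected error was encountered.
TSM return code : 115
07/22/2014 13:27:59 ANS4015E Error processing '<vm hostname>': unexpected TSM error (115)
"""

for entry in summarize_dsmerror(SAMPLE_LOG.splitlines()):
    print(entry["code"], entry["details"])
```

In this case the useful clue is the "API error message" detail on ANS9365E, which points at something else already running against the VM.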
TSM and SAN administration notebook
Problem resolution and how-to notes
Wednesday, July 23, 2014
Friday, July 11, 2014
ANS1311E Server out of Storage space -- TSM4VE back-ups fail
The Tivoli Storage Manager for Virtual Environments (TSM4VE) is a complex infrastructure to support and trouble-shoot.
Recently, this problem recurred, and so I thought that I'd give back to the online TSM community that has helped me resolve problems so many times.
TSM4VE back-ups can fail for many reasons. If you manually run the back-up, then you can retrieve helpful errors, or you can look at the dsmerror.log. Many times the errors are generic enough to confuse admins who have little experience.
Our TSM4VE back-ups were failing with the error message:
TSM4VE back-ups fail ANS1311E Server out of Storage space
Now, TSM4VE back-ups use 2 storage pools: one for data and one for VMware control files.
We use a FalconStor VTL to store the data and a disk pool for the control files.
So, first check the VTL for scratch volumes by logging into dsmadmc:
q libv vlib1
VLIB1 C1128OL4 Scratch 3,929 LTO
VLIB1 C1128PL4 Scratch 3,930 LTO
VLIB1 C1128QL4 Scratch 3,931 LTO
VLIB1 C1128RL4 Scratch 3,932 LTO
VLIB1 C1128SL4 Scratch 3,933 LTO
VLIB1 C1128TL4 Scratch 3,934 LTO
q libv vlib2
VLIB2 C2120BL4 Scratch 3,628 LTO
VLIB2 C2120CL4 Scratch 3,629 LTO
VLIB2 C2120DL4 Scratch 3,630 LTO
VLIB2 C2120EL4 Scratch 3,631 LTO
VLIB2 C2120FL4 Scratch 3,632 LTO
VLIB2 C2120GL4 Scratch 3,633 LTO
Well, we have scratch volumes, so there is plenty of space for the VMs' data.
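If you save the q libv output to a file, the scratch check can be done with a few lines of Python. This is only a sketch: it assumes each row is laid out as in the listing above (library name, volume name, status, then the remaining columns), and count_scratch is a name I made up for illustration.

```python
def count_scratch(libv_output):
    """Count scratch volumes per library in saved 'q libv' output."""
    counts = {}
    for line in libv_output.splitlines():
        fields = line.split()
        # Assumed row layout: library, volume, status, ...
        if len(fields) >= 3 and fields[2] == "Scratch":
            counts[fields[0]] = counts.get(fields[0], 0) + 1
    return counts


# Rows copied from the listings above.
SAMPLE_LIBV = """\
VLIB1 C1128OL4 Scratch 3,929 LTO
VLIB1 C1128PL4 Scratch 3,930 LTO
VLIB1 C1128QL4 Scratch 3,931 LTO
VLIB2 C2120BL4 Scratch 3,628 LTO
VLIB2 C2120CL4 Scratch 3,629 LTO
"""

print(count_scratch(SAMPLE_LIBV))
```

A library that is missing from the result has no scratch volumes left in the saved output, which would be the thing to chase first.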
Next, we checked to see if we have space for the VMs' control files.
q stgpool VMCTLPOOL
Storage      Device      Estimated   Pct     Pct     High  Low  Next Stora-
Pool Name    Class Name  Capacity    Util    Migr    Mig   Mig  ge Pool
                                                     Pct   Pct
-----------  ----------  ----------  ------  ------  ----  ---  -----------
VMCTLPOOL    DISK        1,418 G     100.00  100.00  99    94
So, there is no space to store new VM control files.
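The same check can be scripted against the data row of the q stgpool output. The sketch below assumes the row is split on whitespace and that the estimated capacity is printed as a number followed by a separate unit (as in "1,418 G"), so Pct Util is the fifth token. The function names and the 90 percent threshold are my own choices, not anything TSM defines.

```python
def pool_utilization(stgpool_row):
    """Return (pool name, Pct Util) from one data row of 'q stgpool'.

    Assumed token layout: name, device class, capacity, unit, Pct Util, ...
    """
    fields = stgpool_row.split()
    return fields[0], float(fields[4])


def needs_volume(stgpool_row, threshold=90.0):
    """True when the pool's Pct Util is at or above the chosen threshold."""
    _, pct_util = pool_utilization(stgpool_row)
    return pct_util >= threshold


# Data row copied from the output above.
ROW = "VMCTLPOOL DISK 1,418 G 100.00 100.00 99 94"
print(pool_utilization(ROW), needs_volume(ROW))
```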
So we created another volume in the VMCTLPOOL.
def volume VMCTLPOOL /vmctl1/stg33.dsm formatsize=42709
After this completed, the stalled back-ups started sessions with the TSM server and transferred their data.
Wednesday, September 28, 2011
XIV basics -- easy tutorial
So, I was asked to create 3 new volumes and map them to 2 different XIV
hosts. I had no training. I did have a log in to the XIV.
I found:
http://www-03.ibm.com/systems/data/flash/storage/disk/demos/xiv_gui.html
This demo really helped me learn some basics!
Wednesday, September 21, 2011
Expanding a LUN served from the SVC
Around here, this is a task that MUST be performed like at 1:30 in the afternoon on a Friday.
Why? Procrastination. The filespace monitor (whoever that is) decides that 1 minute after they get back from their lunch they will check 'their' server before they go out of town for the weekend to see what the system admin. overlooked.
This chore is easy and straightforward, depending on how much more space they want, the available resources, and how many data drives they currently have on the server in question.
First, ask how much more space they need (not want) and how many data drives they have on that server. If they have only one data drive then the drive name to extend is <Hostname1>.
If they have more than one data drive and it is Windows, then ask the Sys Admin. how large the drive is currently. If it is a VMware server then ask the system admin for the LUN unique ID.
It will look like this: 6005076801908128C0000000000003C0.
You can match it on the SVC with the next command.
Next, log into the SVC. I will be using the hostname Goliath in this explanation for my examples.
Then run the following to find the disks mapped to the host:
svcinfo lshostvdiskmap Goliath
The output will look like this:
id name SCSI_id vdisk_id vdisk_name vdisk_UID
35 Goliath 0 195 Goliath0 6005076801908128C0000000000003C0
35 Goliath 1 197 Goliath1 6005076801908128C0000000000003C1
This will show you all the disks mapped to the host.
Determine which of these drives you want to expand. If they have multiple data drives, then find the drive whose size matches what the Windows sys admin told you, or whose vdisk_UID matches the LUN unique ID the VMware admin gave you.
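Matching a long UID by eye is error-prone, so here is a small Python sketch that does the lookup against saved lshostvdiskmap output. It assumes the output has a header row and whitespace-separated columns with no empty values, as in the example above. find_vdisk_by_uid is a name I made up.

```python
def find_vdisk_by_uid(hostmap_output, uid):
    """Return the vdisk_name whose vdisk_UID matches uid, or None."""
    lines = [line for line in hostmap_output.splitlines() if line.strip()]
    header = lines[0].split()
    for row in lines[1:]:
        record = dict(zip(header, row.split()))
        # Compare case-insensitively in case the admin sent lowercase hex.
        if record.get("vdisk_UID", "").upper() == uid.upper():
            return record["vdisk_name"]
    return None


# Output copied from the example above.
SAMPLE_MAP = """\
id name SCSI_id vdisk_id vdisk_name vdisk_UID
35 Goliath 0 195 Goliath0 6005076801908128C0000000000003C0
35 Goliath 1 197 Goliath1 6005076801908128C0000000000003C1
"""

print(find_vdisk_by_uid(SAMPLE_MAP, "6005076801908128C0000000000003C1"))
```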
Next, run this command:
svcinfo lsvdisk -filtervalue name=Goliath1
id name IO_group_id IO_group_name status mdisk_grp_id mdisk_grp_name capacity type FC_id FC_name RC_id RC_name vdisk_UID fc_map_count copy_count fast_write_state
33 Goliath1 1 iog1 online 6 mdg2107_15_300 64.00GB striped 6005076801908128C0000000000002D5 0 1 empty
Goliath's managed disk group for Goliath1 (its data drive or LUN) is mdg2107_15_300.
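One thing to watch if you script against this output: the FC_id, FC_name, RC_id, and RC_name columns are empty in the row above, so splitting on spaces puts values under the wrong headers. The SVC CLI accepts a -delim option (for example svcinfo lsvdisk -delim : -filtervalue name=Goliath1), which keeps the empty fields in place. The sketch below parses that style of output. The colon-delimited sample is my own reconstruction of the row above, not captured output, and parse_delimited is a hypothetical helper name.

```python
def parse_delimited(output, delim=":"):
    """Parse header-plus-rows CLI output into a list of dicts."""
    lines = [line for line in output.splitlines() if line.strip()]
    header = lines[0].split(delim)
    return [dict(zip(header, line.split(delim))) for line in lines[1:]]


# Reconstructed from the example row above; the four empty strings are
# the blank FC_id, FC_name, RC_id and RC_name columns.
HEADER = ["id", "name", "IO_group_id", "IO_group_name", "status",
          "mdisk_grp_id", "mdisk_grp_name", "capacity", "type",
          "FC_id", "FC_name", "RC_id", "RC_name", "vdisk_UID",
          "fc_map_count", "copy_count", "fast_write_state"]
ROW = ["33", "Goliath1", "1", "iog1", "online", "6", "mdg2107_15_300",
       "64.00GB", "striped", "", "", "", "",
       "6005076801908128C0000000000002D5", "0", "1", "empty"]
SAMPLE = ":".join(HEADER) + "\n" + ":".join(ROW) + "\n"

vdisk = parse_delimited(SAMPLE)[0]
print(vdisk["mdisk_grp_name"], vdisk["capacity"])
```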
In the following step you are looking to see if there is enough space in the managed disk group to expand that LUN to the desired size.
svcinfo lsmdiskgrp -filtervalue name=mdg2107_15_300
The output will look like this:
id name status mdisk_count vdisk_count capacity extent_size free_capacity virtual_capacity used_capacity real_capacity overallocation warning
102 mdg2107_15_300 online 30 31 54.6TB 512 5.7TB 48.89TB 48.89TB 48.89TB 89 0
So, in my example under free_capacity I have 5.7TB. So that extra 500GB is no big deal.
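The comparison is simple arithmetic, but the units differ (TB free versus GB requested), so a quick sketch may help. It assumes binary units (1 TB = 1024 GB) and capacity strings formatted like the ones in the output above. The helper names are mine.

```python
# Conversion factors to GB, assuming binary units.
UNITS_GB = {"MB": 1 / 1024, "GB": 1, "TB": 1024, "PB": 1024 * 1024}


def to_gb(capacity):
    """Convert a capacity string such as '5.7TB' or '64.00GB' to GB."""
    text = capacity.strip().upper()
    for unit, factor in UNITS_GB.items():
        if text.endswith(unit):
            return float(text[: -len(unit)]) * factor
    raise ValueError(f"unrecognized capacity: {capacity!r}")


def can_expand(free_capacity, extra_gb):
    """True if the group's free capacity covers the requested growth."""
    return to_gb(free_capacity) >= extra_gb


# 5.7TB free, 500GB requested, as in the example above.
print(can_expand("5.7TB", 500))
```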
Had I not had enough space, I'd have had to do one of two things:
1) Add an mdisk to the managed disk group
or
2) Find an unused vdisk (maybe an old flash copy), make sure it's not mapped to a host, and do a rmvdisk on it.
To expand the LUN by 500 GB execute:
svctask expandvdisksize -size 500 -unit gb Goliath1
where Goliath1 is the disk that needed to be expanded.
Now contact the Windows admin and tell her that she has the
needed space, but that she needs to expand the drive on her side.
Tuesday, September 20, 2011
SVC Flashing a copy of a boot LUN
Why would you do this operation? Well, if your OS team wants to install patches, but wants to be able to fall back to a before-the-patches-were-installed state, then you'd flash the boot LUN. In my case the OS people believed that the production boot LUN had some kind of corruption in it. Since the boot LUN is Windows and there is no native OS back-up utility, they wanted a copy of the Test box's boot LUN presented to the production server. So, actually this should be titled "Flashing a copy of the boot LUN from test in order to give to production."
Well let's get to it!
What is the managed disk group for production? (Boot LUNs are named Hostname+0.)
svcinfo lsvdisk -filtervalue name=Eastwood0
From the output, find 2 facts: 1) the size of the boot LUN and 2) the name of your managed disk group.
Then check to see if you have enough space in the managed disk group by running:
svcinfo lsmdiskgrp -filtervalue name=mdg1746c_7_2t
If you have enough space, then make a new vdisk on which to place the copy.
From the first command, get the iogrp # and from the second command get the mdiskgrp #.
To make the new virtual disk, run the following command:
svctask mkvdisk -name Eastwood0New -mdiskgrp 6 -size 64 -unit gb -iogrp 1
Now, to make the flash:
First, get the source drive's ID number with:
svcinfo lsvdisk -filtervalue name=EastwoodT0
If you want to flash copy the current production LUN instead, run:
svcinfo lsvdisk -filtervalue name=Eastwood0
with EastwoodT0 being the test server's LUN and
Eastwood0 being the production server's LUN
Then get the destination drive's id (the drive you just made)
svcinfo lsvdisk -filtervalue name=Eastwood0New
Next, create the flash copy mapping from the source drive to the target drive:
svctask mkfcmap -source 250 -target 179 -name Eastwood0New
If this svctask works, then you will get the message:
"Flashcopy mapping successfully created"
Now, start the flashcopy:
svctask startfcmap -prep Eastwood0New
To check on the progress of how much has been copied run:
svcinfo lsfcmap
Once the flash is done, the Windows system admin needs to shut down the Windows server before you switch boot LUNs on this server. Once the server is down, you can unmap the current boot LUN:
svctask rmvdiskhostmap -host Eastwood Eastwood0
Then map the newly made flash of the test boot LUN:
svctask mkvdiskhostmap -host Eastwood Eastwood0New
Then call the system admin to boot that system.
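Since the order of these steps matters, I find it useful to write the whole sequence out before touching anything. The Python sketch below only builds the command strings from the walkthrough above; it does not connect to the SVC or run anything. The function name and its parameters are hypothetical, and the source and target IDs still have to be read from the lsvdisk output once the new vdisk exists.

```python
def boot_lun_flash_commands(host, old_vdisk, new_vdisk, source_id,
                            target_id, mdiskgrp, iogrp, size_gb):
    """Return the svctask/svcinfo commands for the boot LUN swap, in order."""
    return [
        # 1. Create the target vdisk, the same size as the source boot LUN.
        f"svctask mkvdisk -name {new_vdisk} -mdiskgrp {mdiskgrp} "
        f"-size {size_gb} -unit gb -iogrp {iogrp}",
        # 2. Create the flash copy mapping from source to target.
        f"svctask mkfcmap -source {source_id} -target {target_id} "
        f"-name {new_vdisk}",
        # 3. Prepare and start the copy, then watch its progress.
        f"svctask startfcmap -prep {new_vdisk}",
        "svcinfo lsfcmap",
        # 4. Only after the server is shut down: swap the host mappings.
        f"svctask rmvdiskhostmap -host {host} {old_vdisk}",
        f"svctask mkvdiskhostmap -host {host} {new_vdisk}",
    ]


# Values taken from the walkthrough above.
for command in boot_lun_flash_commands(
        host="Eastwood", old_vdisk="Eastwood0", new_vdisk="Eastwood0New",
        source_id=250, target_id=179, mdiskgrp=6, iogrp=1, size_gb=64):
    print(command)
```

Printing the list first gives you something to paste into a change ticket and to read back with the Windows admin before the server goes down.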
Monday, September 19, 2011
Commands to look around the SVC -> svcinfo
When I started SVC admin, I wanted the informational commands that would not get me into trouble
and would just show me how everything is configured. The subcommands of svcinfo fit the bill.
To look at the last commands run to change things:
svcinfo catauditlog
To look at the SVC's error log:
svcinfo caterrlog
To see whether a flash copy has completed, or what flash copies you have:
svcinfo lsfcmap
Managed disk groups? These are LUNs given to the SVC and placed into
sets called managed disk groups, or mdiskgrp. The point of the SVC is to be able to
'carve up' LUNs of desired sizes and then share them with clients of the SVC (hosts).
What managed disk groups are on your SVC?
svcinfo lsmdiskgrp
If you want to get useful information about one mdiskgrp:
svcinfo lsmdiskgrp mdg1746a_7_2t
Or find all mdiskgrps whose names start with the same naming scheme:
svcinfo lsmdiskgrp -filtervalue name="mdg*"
Need to see all the vdisks that have been carved from a mdiskgrp?
svcinfo lsvdisk -filtervalue mdisk_grp_name=<name of a mdiskgrp>
Now, for the mdiskgrp and vdisk to be useful they must be mapped to hosts.
To see what host a vdisk is mapped to run:
svcinfo lsvdiskhostmap <vdiskname>
If the vdisk is not mapped to a host then you could map it to a host or remove it.
To see what vdisks are mapped to a host run:
svcinfo lshostvdiskmap <hostname>
These svcinfo commands will give you information only and will not change anything,
so feel free to run them and get familiar with your SVC environment.
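If you keep your commands in a script or a notes file, a tiny guard can enforce the "look, don't touch" rule while you are still learning. This is a sketch of my own, not an SVC feature: it simply checks that a command line starts with svcinfo rather than svctask.

```python
import shlex


def is_read_only(command):
    """True if the command line invokes svcinfo (informational only)."""
    tokens = shlex.split(command)
    return bool(tokens) and tokens[0] == "svcinfo"


# Commands taken from the posts on this page.
for command in ['svcinfo lsmdiskgrp -filtervalue name="mdg*"',
                "svcinfo lshostvdiskmap Goliath",
                "svctask expandvdisksize -size 500 -unit gb Goliath1"]:
    print(is_read_only(command), command)
```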
Here is a great IBM link
http://publib.boulder.ibm.com/infocenter/svcic/v3r1m0/index.jsp?topic=%2Fcom.ibm.storage.svc.console.doc%2Fsvc_informationcomm_21pasg.html
under "Informational Commands" are all the sub-options of svcinfo
It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness
So my work partner (He did SAN; I did TSM) found a great new job with IBM.
Since he is the sole support for his family, I am glad that he found such a great job and jump in pay.
Since I understood little of what he did and am expected to do so, I am afraid.
I was hired to take some of the AIX, TSM, and SAN workload off his shoulders and to back him up.
He was a patient teacher when I was learning TSM admin. His dry wit, compassionate ear of a husband and father who could share my joys and sorrows, being able to discuss our shared Christianity, his thoroughness and quick intelligence, his encyclopedic knowledge of music, movies, Texas cities, and County employees will be missed sorely. He was one of the best work partners that I have ever had the pleasure to work with... (dangling participle and all).
I am being expected to administer the following: IBM SVC, Brocade DSX, Xseries blade center switches, and N-series (NetApp). For storage I have IBM's DS8100, 1746, 1726, and three XIVs.
Yes, my employer is seeking to fill this position but I am not optimistic.
After searching high and low for blogs of other professionals and finding only business-sponsored blogs with no real technical content, I decided that as I learned administrative commands and concepts I'd place my notes online, not just to share but also to have as a quick reference. With such a variety to manage, I will not be able to memorize it all. When I have learned how to do something, I will post it.