Thursday, December 18, 2008

I wrote this script to create VMs, since we're making 350 for the remote access project. It uses some code that a co-worker found on the internet, which I modified to loop so it creates more than one.

#!/bin/bash
declare -i hexcount=0x0
for vmcount in $(seq 1 5)
do
VMDIR=vdi$vmcount
VMNAME=vdi$vmcount
VMMAC=00:50:56:00:01:$(printf %02x $hexcount)
NFSVOL="cgy-vdi-nfs01"

#Begin pasted script code
#Creates a new Virtual Machine

VMOS="winxppro"
VMDSIZE="20g"
VMMEMSIZE="1024"

mkdir /vmfs/volumes/$NFSVOL/$VMDIR
exec 6>&1
exec 1>/vmfs/volumes/$NFSVOL/$VMDIR/$VMNAME.vmx

# write the configuration
echo '#!/usr/bin/vmware'
echo config.version = '"'8'"'
echo virtualHW.version = '"'4'"'
echo memsize = '"'$VMMEMSIZE'"'
echo floppy0.present = '"'TRUE'"'
echo usb.present = '"'FALSE'"'
echo displayName = '"'$VMNAME'"'
echo guestOS = '"'$VMOS'"'
echo ide0:0.present = '"'TRUE'"'
echo ide0:0.fileName = '"'/vmfs/volumes/vdi-scsi-local/BootableX86_12092008.iso'"'
echo ide0:0.deviceType = '"'cdrom-image'"'
echo ide0:0.startConnected = '"'TRUE'"'
echo floppy0.startConnected = '"'FALSE'"'
echo floppy0.fileName = '"'/dev/fd0'"'
echo Ethernet0.present = '"'TRUE'"'
echo Ethernet0.networkName = '"'vmnts222'"'
echo Ethernet0.addressType = '"'static'"'
echo Ethernet0.address = '"'$VMMAC'"'
echo scsi0.present = '"'TRUE'"'
echo scsi0:1.present = '"'TRUE'"'
echo scsi0:1.filename = '"'$VMNAME.vmdk'"'
echo scsi0:1.writeThrough = '"'TRUE'"'
echo scsi0.virtualDev = '"'lsilogic'"'
echo
# close file
exec 1>&-

# make stdout a copy of FD 6 (reset stdout), and close FD6
exec 1>&6
exec 6>&-

chmod 755 /vmfs/volumes/$NFSVOL/$VMDIR/$VMNAME.vmx
# Create Disk & Register the .vmx configuration

#Creates the virtual disk (size is set by VMDSIZE above)
cd /vmfs/volumes/$NFSVOL/$VMDIR
vmkfstools -c $VMDSIZE $VMNAME.vmdk -a lsilogic

#Register VM
vmware-cmd -s register /vmfs/volumes/$NFSVOL/$VMDIR/$VMNAME.vmx

((hexcount=hexcount+1))
done
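
The loop above only increments the last octet of the MAC, which covers 256 addresses, so 350 VMs would need the count to carry into the next octet. Here is a minimal sketch of one way to do that, assuming the same 00:50:56:00:01:xx starting point. The helper name mac_for_index is hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: map a zero-based VM index to a static MAC address.
# Assumes the numbering starts at 00:50:56:00:01:00, as in the script above.
mac_for_index() {
    local index=$1
    local value=$(( 0x0100 + index ))   # the last two octets, treated as one 16-bit number
    printf '00:50:56:00:%02x:%02x\n' $(( (value >> 8) & 0xff )) $(( value & 0xff ))
}

# Show where the carry into the fifth octet happens
for n in 0 255 256 349; do
    echo "vdi$((n + 1)) -> $(mac_for_index "$n")"
done
```

Inside the creation loop this would stand in for the VMMAC line, for example VMMAC=$(mac_for_index $((vmcount - 1))).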

Monday, October 27, 2008

AMD cpu mask script

I wrote this script when we got an HP DL585 G2 server to add to our dev/test cluster, which is mostly G1s. It checks all the vmx files on a server and modifies them to have the correct CPU masking, so that VMotions between G1 and G2 servers succeed.

#!/bin/bash

PATH=$PATH:/bin:/usr/bin

for i in `vmware-cmd -l`
do
cp -f --reply=yes $i $i.back
if [ `grep -c "cpuid.80000001.edx =" $i` = 0 ]
then
echo 'cpuid.80000001.edx = "-----------0--------H-----------"' >> $i
else
cp -f --reply=yes $i $i.sed1
sed -e 's/cpuid.80000001.edx = "--------------------H-----------"/cpuid.80000001.edx = "-----------0--------H-----------"/g' $i.sed1 > $i.sed2
sed -e 's/cpuid.80000001.edx = "-----------0--------------------"/cpuid.80000001.edx = "-----------0--------H-----------"/g' $i.sed2 > $i
fi
if [ `grep -c cpuid.80000001.edx.amd $i` = 0 ]
then
echo 'cpuid.80000001.edx.amd = "----0R-----0--------H------T----"' >> $i
else
cp -f --reply=yes $i $i.sed1
sed -e "s/cpuid.80000001.edx.amd = \"-----R--------------H------T----\"/cpuid.80000001.edx.amd = \"----0R-----0--------H------T----\"/g" $i.sed1 > $i.sed2
sed -e "s/cpuid.80000001.edx.amd = \"-----R-----0--------H------T----\"/cpuid.80000001.edx.amd = \"----0R-----0--------H------T----\"/g" $i.sed2 > $i.sed3
sed -e "s/cpuid.80000001.edx.amd = \"-----------0--------------------\"/cpuid.80000001.edx.amd = \"----0R-----0--------H------T----\"/g" $i.sed3 > $i
fi
done
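
The script above rewrites only the specific mask values it knows about. A simpler variant, sketched here as an assumption rather than what the original does, drops whatever line exists for a key and appends the wanted one, so running it twice leaves a single entry. The function name set_vmx_key is hypothetical, and the sketch assumes mktemp is available on the host.

```shell
#!/bin/bash
# Hypothetical helper: set KEY = "VALUE" in a .vmx file, replacing any existing line for KEY.
set_vmx_key() {
    local file=$1 key=$2 value=$3
    local tmp
    tmp=$(mktemp) || return 1
    # Escape the dots so "cpuid.80000001.edx" does not also match "cpuid.80000001.edx.amd"
    grep -v -E "^${key//./\\.}[[:space:]]*=" "$file" > "$tmp"
    printf '%s = "%s"\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"   # write back in place so the .vmx keeps its permissions
    rm -f "$tmp"
}

# Example call (mask value taken from the script above):
#   set_vmx_key "$i" cpuid.80000001.edx '-----------0--------H-----------'
```

Unlike the sed version, this overwrites any mask already present, including ones set by hand, so keeping the .back copy is still worthwhile.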

Wednesday, September 24, 2008

More useful scripts

Wrote a couple handy scripts lately. First one checks for ISOs mounted on VMs, because if they are, you can't VMotion them (and then often have to shut the guest OS down - annoying!)

#! /bin/bash
rm -rf /root/isocheck.txt
touch /root/isocheck.txt
for host in `cat isohosts.txt`
do
echo $host
for vmx in `ssh $host vmware-cmd -l`
do
echo $vmx >> /root/isocheck.txt
ssh $host grep -i 'ide0:0.startConnected\ \=\ \"true\"' $vmx >> /root/isocheck.txt
done
done
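
The grep above only looks at ide0:0. If a CD-ROM sits on another IDE slot, a broader pattern would be needed. This is a minimal sketch of such a filter, reading .vmx text on stdin; the function name connected_ide_devices is hypothetical, and it assumes any IDE device with startConnected set to true is worth flagging.

```shell
#!/bin/bash
# Hypothetical filter: print every IDE device line that is set to connect at power-on.
# Reads .vmx content on stdin; matching is case-insensitive like the original grep -i.
connected_ide_devices() {
    grep -i -E '^ide[0-9]+:[0-9]+\.startConnected[[:space:]]*=[[:space:]]*"true"'
}

# Example with inline sample data
printf '%s\n' \
    'ide0:0.present = "TRUE"' \
    'ide0:0.startConnected = "TRUE"' \
    'floppy0.startConnected = "FALSE"' | connected_ide_devices
```

Against a remote host it could be fed with something like ssh $host cat $vmx | connected_ide_devices.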


I have MIME-Lite installed on this ESX box, which is a simple perl-based SMTP agent, and it emails me this file afterwards.

The second script sets up all the options to authenticate accounts on an ESX server using a domain account (in this example, esxrangerservice). You need only do "useradd username" after this to allow new users to connect.

#!/bin/bash
/usr/sbin/esxcfg-firewall -o 88,tcp,out,KerberosClient
/usr/sbin/esxcfg-firewall -o 464,tcp,out,KerberosPasswordChange
/usr/sbin/esxcfg-auth --enablead --addomain=yourdomain.com --addc=yourDC.yourdomain.com
/usr/sbin/esxcfg-auth --enablekrb5 --krb5realm=yourdomain.com --krb5kdc=yourDC.yourdomain.com
cp /etc/pam.d/vmware-authd /etc/pam.d/vmware-authd.old
/usr/sbin/useradd esxrangerservice
echo "#%PAM-1.0" > /etc/pam.d/vmware-authd
echo "auth sufficient pam_unix_auth.so shadow nullok" >> /etc/pam.d/vmware-authd
echo "auth required pam_stack.so service=system-auth" >> /etc/pam.d/vmware-authd
echo "account required pam_stack.so service=system-auth" >> /etc/pam.d/vmware-authd

Tuesday, September 23, 2008

Scripted ESX build with 3.5 not working

I've been scripting our builds for a while now, but haven't done one since our 3.5 upgrade - I'm just going through a test one now before we roll out some new hosts, and my script process which was working great before doesn't seem to produce a working config anymore.

The script code for routing looks like this. There is some switch config done as well, but since the output of esxcfg-vswitch and such is below, I left it out.

#Service Console
#Add service console IP address
echo Adding service console IP address
/usr/sbin/esxcfg-vswif -a -i 172.31.8.168 -n 255.255.255.0 -p "Service Console" vswif1
sleep 5

#Kernel Network
echo Adding Kernel Network Address
/usr/sbin/esxcfg-vmknic -a -i 172.31.8.169 -n 255.255.255.0 VMotion
sleep 5

#Service Console Routing
#Add esx route to VMkernel
echo Adding default route to Service Console
/usr/sbin/esxcfg-route 172.31.8.1
sleep 3

#Routing
#add networking info to /etc/sysconfig/network
echo adding route
cat > /etc/sysconfig/network << EOF2
NETWORKING=yes
HOSTNAME=esxt01.domain.com
GATEWAY=172.31.8.1
GATEWAYDEV=vswif1
EOF2
sleep 3

Here are the contents of every file and command I could think of that's even remotely related.

/etc/sysconfig/network-scripts/ifcfg-vswif1
DEVICE=vswif1
MACADDR=00:50:56:48:9e:17
PORTGROUP="Service Console"
BOOTPROTO=static
BROADCAST=172.31.8.255
IPADDR=172.31.8.168
NETMASK=255.255.255.0
ONBOOT=yes

/etc/sysconfig/network
NETWORKING=yes
HOSTNAME=esxt01.domain.com
GATEWAY=172.31.8.1
GATEWAYDEV=vswif1

/etc/hosts

127.0.0.1 localhost.localdomain localhost
172.31.0.251 nts200.domain.com nts200
172.31.0.248 nts203.domain.com nts203
172.31.5.148 nts334.domain.com nts334
172.31.8.168 esxt01.domain.com esxt01

esxcfg-vmknic -l
Interface  Port Group  IP Address    Netmask        Broadcast     MAC Address        MTU   TSO MSS   Enabled
vmk0       VMotion     172.31.8.169  255.255.255.0  172.31.8.255  00:50:56:78:d6:51  1500  disabled  true

esxcfg-vswif -l
Name    Port Group       IP Address    Netmask        Broadcast     Enabled  DHCP
vswif1  Service Console  172.31.8.168  255.255.255.0  172.31.8.255  true     false

esxcfg-vswitch -l
Switch Name         Num Ports  Used Ports  Configured Ports  MTU   Uplinks
Production          64         3           64                1500  vmnic1

PortGroup Name      VLAN ID  Used Ports  Uplinks
vmnts08             8        0           vmnic1

Switch Name         Num Ports  Used Ports  Configured Ports  MTU   Uplinks
Management          64         5           64                1500  vmnic0

PortGroup Name      VLAN ID  Used Ports  Uplinks
VMotion             8        1           vmnic0
Service Console     8        1           vmnic0

Switch Name         Num Ports  Used Ports  Configured Ports  MTU   Uplinks
TestPrivateNetwork  64         1           64                1500

PortGroup Name      VLAN ID  Used Ports  Uplinks
TestPrivateNetwork  0        0

esxcfg-route -l
VMkernel Routes:
Network     Netmask        Gateway
172.31.8.0  255.255.255.0  Local Subnet
default     0.0.0.0        172.31.8.1

The route command produces the following, with a big delay (10-15 seconds) before it displays the third line, the one with the default route on it.

Kernel IP routing table
Destination  Gateway     Genmask        Flags  Metric  Ref  Use  Iface
172.31.8.0   *           255.255.255.0  U      0       0    0    vswif1
169.254.0.0  *           255.255.0.0    U      0       0    0    vswif1
default      172.31.8.1  0.0.0.0        UG     0       0    0    vswif1

I just get a "Destination Host Unreachable" when I try to ping anything, which tells me it doesn't know how to route to the network. Whiskey tango foxtrot.
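
One thing that can be ruled out from the output above is a gateway that sits outside the service console's subnet. A minimal sketch of that arithmetic, using the addresses from this post; the function names ip_to_int and same_subnet are hypothetical helpers, and the check says nothing about VLANs or the physical uplink.

```shell
#!/bin/bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed if two addresses fall in the same network under the given netmask.
same_subnet() {
    local ip gw mask
    ip=$(ip_to_int "$1")
    gw=$(ip_to_int "$2")
    mask=$(ip_to_int "$3")
    [ $(( ip & mask )) -eq $(( gw & mask )) ]
}

# Service console address, default gateway, and netmask from the config above
if same_subnet 172.31.8.168 172.31.8.1 255.255.255.0; then
    echo "gateway is on the local subnet"
else
    echo "gateway is NOT on the local subnet"
fi
```

With the values in this post the addressing is consistent, so the check only narrows the search; it does not explain the unreachable hosts.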

Friday, September 5, 2008

esxRanger Backups to DMZ machines

I have set up Ranger to run all our VM backups - it's working pretty well, but a recent upgrade caused machines in our DMZ to do this:

Error In Backup Operations! Error: An error occurred during backup operations. The Archive Created Appears to be Invalid. Failed to read Status File.

As it turns out, the upgrade needs a whole bunch more ports opened through to the DMZ now. Since that's a terrible idea, I turned on the "-encryptdata" option, which means everything goes over port 22 now - a much better idea for security, although slower.

Full error log is below.

:: vRanger PRO CLI Backup Commencing ::
Version: 3.2.3.3
"C:\Program Files\vizioncore\esxRanger Professional\esxRangerProCli.exe" esxd03.domain.com /vmfs/volumes/vmdmzprod01/dmz17/dmz17.vmx -copylocal "Z:\cgy-dmz-thur" "-drives:all" -mailonerror -vmnotes -noquiesce -diffratio 50 -maxfullage 5 -retendays 14 -description "cgy dmz thursday full" -zipname "[config]" -vmkey vm-658
[20080905_110907]
Sending log on error only.
The TarBall name will be: [dmz17].
VirtualCenter VM Key: vm-658.
Acquiring VM Lock. This may take many minutes!
VM Lock Acquired. Backup Initialization Commencing. Please Wait...
9/5/2008 11:09:28 AM: Backing up the VM 'dmz17'.
9/5/2008 11:09:28 AM: Beginning Testing Phase.
9/5/2008 11:09:28 AM: Testing Destination Path.
9/5/2008 11:09:33 AM: Snapshotting VM.
9/5/2008 11:09:35 AM: Starting Server Component.
9/5/2008 11:09:37 AM: Performing Backup.
9/5/2008 11:19:42 AM: An error occurred during backup operations:
The Archive Created Appears to be Invalid. Failed to read Status File.
9/5/2008 11:19:42 AM: Removing Backup Snapshot.
9/5/2008 11:20:16 AM: Setting Final Backup Data.
9/5/2008 11:20:16 AM: Writing Backup Information.
9/5/2008 11:20:17 AM: Writing VM Note.
Error In Backup Operations! Error: An error occurred during backup operations. The Archive Created Appears to be Invalid. Failed to read Status File.
9/5/2008 11:20:21 AM: Disconnecting Open Connections.
9/5/2008 11:20:21 AM: Waiting for Disconnection.
9/5/2008 11:20:22 AM: Done.
9/5/2008 11:20:22 AM: Disconnection Sequence Complete.
Pausing for 30 seconds...
You can safely close this window now.

Tuesday, September 2, 2008

I guess people think IT is like a toaster...

This woman sent this email to all of IS. Managers, programmers, everyone...

That being said, it made me think about why she would send it. Anyone who can use email must get spam, so if she opens one of the attachments, she would probably see that. It's a common method of getting around email filters - and one that normally gets filtered by our message filters.

I guess she just assumes because it isn't really to her, that it's an error, not some person on the internet being shady/clever (it is a pretty clever way of doing it frankly). When she pushes the bread down, she expects toast back up, not a bagel.



I am receiving tons of these notifications about e-mails that I have not sent. Can you please look into for me.
-----Original Message-----
From: postmaster@smtp01.2s1n.com [mailto:postmaster@smtp01.2s1n.com]
Sent: Monday, September 01, 2008 11:55 AM
To: *********
Subject: Delivery Status Notification (Delay)

This is an automatically generated Delivery Status Notification.
THIS IS A WARNING MESSAGE ONLY.
YOU DO NOT NEED TO RESEND YOUR MESSAGE.
Delivery to the following recipients has been delayed.
webmaster@progressive-equip.com

Monday, August 25, 2008

Back from Vacation!

Most of August was vacation time! Enjoyed the Shuswap lake in BC where my parents have a house - very relaxing. The kids spent most of the summer there too, and they can swim and waterski now too.

Thursday, July 31, 2008

VMware snapshots 160GB!

One of our field ESX servers had a snapshot from Ranger that didn't get deleted, and it grew to 160GB, freezing the VM and making people at the site unable to log in. I set the snapshot to delete, but it's been deleting for more than 24 hours.

Turns out the only way to watch a snapshot being deleted requires the VM to be on: run esxtop, press "e", and enter the ID of the process group for the VM. This expands the group for that VM, and you can check it for a SnapshotVMXCombo process. If it is there, the snapshot process is still running.

If the VM is off, there's apparently no way to tell if it's still deleting or not. Wow.

Wednesday, July 30, 2008

This set of scripts saved my ass 100% just now. We've had a few issues with VMware servers being disconnected due to network incidents - not just one but the whole farm goes down (they all think they are isolated, and in appropriate fashion shut down their VMs, expecting one of the other HA cluster members to power them up - however all the hosts are isolated, so there isn't a cluster up for the VMs to be powered up on).

Anyway, following the 2nd time that happened, I wrote some scripts: the first logs which VMs are running on a host each night, and the second powers those VMs up. (I also wrote a third, which gracefully shuts down all the VMs on a host. It's handy, and it's posted third below.)

The reason it saved me is because I had just had to hard reboot a completely unresponsive ESX host, and there are a couple servers that weren't powered on for various reasons, and I forgot to write down which ones beforehand. Yay for pre-emptive scripting!

vmstate script:
#! /bin/bash
echo This script exports the VMs that are running to a text file for later startup/shutdown operations
rm -rf /root/vmlist
rm -rf /root/vmonlist
rm -rf /root/vmofflist
touch /root/vmlist
touch /root/vmonlist
touch /root/vmofflist
ON="on"
for vm in $( vmware-cmd -l );
do
# use absolute paths so the lists land in /root regardless of the working directory
echo $vm >> /root/vmlist
done
for vm2 in $( cat /root/vmlist );
do
state=$( vmware-cmd -q $vm2 getstate );
if [ "$state" = "$ON" ]
then echo $vm2 >> /root/vmonlist
else echo $vm2 >> /root/vmofflist
fi
done

vmstart script:
#! /bin/bash
echo This script starts all VMs listed in /root/vmonlist
echo If you are recovering from an incident, this list was generated at 5:30 PM
echo If you are unsure, please quit and ask someone else
OPTIONS="Proceed Quit"
select opt in $OPTIONS;
do
if [ "$opt" = "Quit" ]; then
exit
elif [ "$opt" = "Proceed" ]; then
for vm in $( cat /root/vmonlist );
do
vmware-cmd -q $vm start
done
exit
else
echo bad option
fi
done


vmstop script:
#! /bin/bash
echo This script will shutdown all running Virtual Machines
echo ARE YOU SURE YOU WANT TO DO THIS?
OPTIONS="YES NO"
select opt in $OPTIONS;
do
if [ "$opt" = "NO" ]; then
exit
elif [ "$opt" = "YES" ]; then
for vm in $( cat /root/vmonlist );
do
echo shutting down $vm
vmware-cmd -q $vm stop trysoft
done
echo Waiting 5 minutes, then forcing shutdown
sleep 5m
for vm in $( cat /root/vmonlist );
do
vmware-cmd -q $vm stop hard
done
# leave the select loop once the shutdown pass is finished
exit
else
echo bad option
fi
done
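
The fixed five-minute sleep in vmstop could be swapped for a poll that returns as soon as every listed VM is off. A minimal sketch, assuming "vmware-cmd -q <vmx> getstate" prints "on" for a running VM as the vmstate script expects; get_state and wait_for_poweroff are hypothetical names.

```shell
#!/bin/bash
# Query one VM's power state; wraps the same vmware-cmd call used in vmstate.
get_state() {
    vmware-cmd -q "$1" getstate
}

# Return 0 once no VM in the list file reports "on", or 1 after TIMEOUT seconds.
# Usage: wait_for_poweroff LIST_FILE TIMEOUT_SECONDS POLL_INTERVAL_SECONDS
wait_for_poweroff() {
    local list=$1 timeout=$2 interval=$3
    local waited=0 vm still_on
    while :; do
        still_on=0
        while read -r vm; do
            [ -n "$vm" ] || continue            # skip blank lines
            [ "$(get_state "$vm")" = "on" ] && still_on=1
        done < "$list"
        [ "$still_on" -eq 0 ] && return 0       # everything is off
        [ "$waited" -ge "$timeout" ] && return 1
        sleep "$interval"
        waited=$((waited + interval))
    done
}
```

In vmstop, "sleep 5m" would become "wait_for_poweroff /root/vmonlist 300 10", with the hard-stop loop kept afterwards as the fallback.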

I run a cron job at 5:30 every night to spit out the powered-on VMs into the file. That's because VMs mostly move around during the day, while a couple of the junior admins might be relocating or powering up new ones. Hope this helps someone!
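
For reference, a 5:30 PM schedule could look like the crontab line built below. This is a sketch: the path /root/vmstate is an assumption about where the script is saved, and the exact entry on the original host is not shown in this post.

```shell
#!/bin/bash
# Assumption: the vmstate script is saved as /root/vmstate and is executable.
# Crontab fields: minute hour day-of-month month day-of-week command
# "cd /root" keeps any relative output paths landing in /root.
CRON_LINE='30 17 * * * cd /root && /root/vmstate >/dev/null 2>&1'
echo "$CRON_LINE"

# To append it to root's crontab without clobbering existing entries:
#   ( crontab -l 2>/dev/null; echo "$CRON_LINE" ) | crontab -
```
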
I bought a Wii Fit yesterday, and am going to track my progress on wiifits.blogspot.com - I'm going to try to convince Stevie to do it as well.

Monday, July 28, 2008

I hate email users

We got this string of emails forwarded to our company. I have removed the people on the to: list because I have no evidence that they forwarded it, unlike the people in the from fields.

Normally this sort of thing wouldn't concern me, since we've locked down our top-level distribution lists to remove the possibility of anyone forwarding this to mass quantities of people. However some --enterprising-- (read: we've locked down the lists to keep you from doing this) user put EVERY OTHER MAILING LIST in a single email, and forwarded it to the WHOLE COMPANY. And then OTHER USERS STARTED RE-FORWARDING IT TO THE WHOLE COMPANY. And then OTHER USERS STARTED REPLYING TO THE WHOLE COMPANY TO PLEASE STOP FORWARDING THIS. And then finally, the last guy sent "ditto", which reminds us, of course, of the Dilbert cartoon about this exact subject.

The help desk "spoke directly to, and reminded each user of our email acceptable user policy". Glad I wasn't doing it, because there would have been some crying going on, and I probably would have got fired.

From: Goddard, Shando [mailto:SGoddard@petro-canada.ca]
Sent: Monday, July 07, 2008 4:50 PM
To: removed
Subject: FW: I DON'T KNOW HOW IT WORKS, BUT IT DOES.

--------------------------------------------------------------------------------
From: Appleton, Lee
Sent: Monday, July 07, 2008 7:10 AM
To: removed
Subject: FW: I DON'T KNOW HOW IT WORKS, BUT IT DOES.

Best Regards,

D.L. (Lee) Appleton

--------------------------------------------------------------------------------
From: McKinnon, Jim
Sent: Sunday, July 06, 2008 9:02 PM
To: removed
Subject: FW: I DON'T KNOW HOW IT WORKS, BUT IT DOES.

--------------------------------------------------------------------------------
From: Robert Lawson [mailto:palawson@telusplanet.net]
Sent: Sunday, July 06, 2008 2:34 PM
To: removed
Subject: FW: I DON'T KNOW HOW IT WORKS, BUT IT DOES.


Best of luck to everyone.



-----Original Message-----
From: Lawson, Robert [mailto:BoLawson@petro-canada.ca]
Sent: Sunday, July 06, 2008 2:43 PM
To: Lawson, Robert
Subject: FW: I DON'T KNOW HOW IT WORKS, BUT IT DOES.

--------------------------------------------------------------------------------

From: Campbell, Dennis
Sent: Thursday, July 03, 2008 8:41 AM
To: removed
Subject: FW: I DON'T KNOW HOW IT WORKS, BUT IT DOES.

This is a just in case it does work.

--------------------------------------------------------------------------------

From: Marty Price [mailto:Marty.Price@Halliburton.com]
Sent: Thursday, July 03, 2008 6:39 AM
To: removed
Subject: FW: I DON'T KNOW HOW IT WORKS, BUT IT DOES.

Blackberry post!

My first mobile post. My new BB is easier to type on, full keyboard
and all. My old one suffered a "shuswap water" treatment and stopped
functioning, surprise surprise.

Anyway, fragapalooza is coming up, paul brad brent igor and laz are
all attending, should be fun. Brad sent me a link to his tech blog
www.bradstechblog.com and I remembered that SCCM makes baby jesus cry.

--
Sent from Gmail for mobile | mobile.google.com

Ack I should do more regular updates

I figured out how to post to the blog remotely using my blackberry, so hopefully more posts will come, and more frequently.

We upgraded our NetApp filers to 7.2.4 last week, and one of our LUNs had duplicate name mapping - how the hell does that happen? Anyway, all the VMs on that LUN had to be moved after the outage. Thankfully nothing serious happened, but it just proves why we have jobs - IT stuff will always be ahead of the knowledge curve of normal users.

Sure they're catching up to IT workers in knowledge, but we'll always be ahead because IT will continue to become more complicated. Users can do a lot of the things for themselves that administrators used to have to do 10 years ago, but now we have more complicated tasks (like managing VMware infrastructure).

Stevie went to Kate's stag party on Saturday night, and I had to pick her drunk ass up at a bar :) Hopefully Ken's will be just as much fun. Wait, didn't they do these things the first time they got married? I'm glad I didn't know them the first time, because if I'd bought them a gift then, I would have stolen it back and re-gifted it to them.

Got into the Warhammer Online beta - can't discuss further, NDA etc...