2013-01-28

DevOpsCon 2013

Today I attended DevOpsCon 2013 in Hertzliya, Israel. As you can see from the agenda, there was an abundance of great sessions and I thoroughly enjoyed myself.

This was not like most of the one-day seminars I usually go to – it was focused 100% on DevOps – and virtualization was just a (very small) by-product – seeing that it was only the underlying infrastructure.

The speakers were very good (almost all of them), and the content was interesting and thought-provoking. The food was fine and the venue was pretty much OK (albeit a bit crowded).

I had a really good time, it was great to see the interactions on Twitter and the feedback that was given during the sessions (even though I expected a lot more – it seems that Israel is not that big on Twitter).

That's me

I will be going into depth with my thoughts on some of the sessions in future blog posts but here are 8 little gems (of many) that came out of the day.

  1. Microsoft are hosting the day. As soon as they understood that most people are Linux guys, the AC is arctic. Penguin is frozen #devopscon
  2. Interesting question - how does DevOps deal with Database changes? #devopscon
    There was no concrete answer!
  3. He's got Linux emulation on his Windows desktop, used to push code to his Linux VMs running on Azure #Aabomination #devopscon
  4. It is sooo obvious from this demo that Microsoft is so not in the #devops game #devopscon
  5. Culture is one of the biggest challenges #devopscon
  6. There are only two log levels:
    a. Too much bullshit
    b. I'm f***'ing blind!! #devopscon
  7. Human driven auto-scaling - is not realy the right way to do it #devopscon
  8. There is no such a thing as hard-coded dynamic IP's #devopscon

I really appreciate the feedback!

I would like to leave you with a comment from Ben Kepes which is so true, about today and always

Which translates to:

What is the most important thing in the world?
It is people! It is people! It is people!

I will post a link to the slide decks when they become available.

Updated: The slide decks are coming in.

How I Learned to Relax and Love the Logs, Avishai Ish-Shalom

Vagrant and Puppet, your Ops Sketching Board, Ronen Narkis

Little-Big Data Adventures, Or Cohen


Error Removing Nexus 1000V VEM

I encountered this last week and was not able to find any reference to my specific problem, so I am documenting it here.
I was trying to remove the Cisco Nexus 1000V VEM from the ESXi hosts in my lab.
This was the error I was getting.
Fail1
This is what I had from the esxupdate.log file

2013-01-24T08:35:32Z esxupdate: LiveImageInstaller: DEBUG: Starting to live remove VIBs: Cisco_bootbank_cisco-vem-v147-esx_4.2.1.1.5.2b.0-3.1.1
2013-01-24T08:35:32Z esxupdate: LiveImageInstaller: INFO: Live removing cisco-vem-v147-esx-4.2.1.1.5.2b.0-3.1.1
2013-01-24T08:35:32Z esxupdate: vmware.runcommand: INFO: runcommand called with: args = '['/sbin/chkconfig', '-B', '/etc/chkconfig.db', '-D', '/etc/init.d', '-i', '-o']', outfile = 'None', returnoutput = 'True', timeout = '0.0'.
2013-01-24T08:35:32Z esxupdate: LiveImageInstaller: DEBUG: Running [['/etc/init.d/n1k-vem', 'stop', 'remove']]...
2013-01-24T08:35:32Z esxupdate: vmware.runcommand: INFO: runcommand called with: args = '['/etc/init.d/n1k-vem', 'stop', 'remove']', outfile = 'None', returnoutput = 'True', timeout = '0.0'.
2013-01-24T08:35:33Z esxupdate: LiveImageInstaller: DEBUG: output: svsStop, remove
svsStopRemove
stopDpa
Stopping Cisco Nexus 1000V VEM
stopDpa
Unload N1k switch modules
Warning: /dev/char/vmkdriver/stun not found
Unload of N1k modules done.
2013-01-24T08:35:33Z esxupdate: LiveImageInstaller: DEBUG: Starting to run etc/vmware/shutdown/shutdown.d/*
2013-01-24T08:35:33Z esxupdate: LiveImageInstaller: DEBUG: Trying to unmount payload [cisco-vem-v147-] of VIB Cisco_bootbank_cisco-vem-v147-esx_4.2.1.1.5.2b.0-3.1.1
2013-01-24T08:35:33Z esxupdate: LiveImageInstaller: DEBUG: Unmounting cisco_ve.v00...
2013-01-24T08:35:33Z esxupdate: vmware.runcommand: INFO: runcommand called with: args = 'rm /tardisks/cisco_ve.v00', outfile = 'None', returnoutput = 'True', timeout = '0.0'.
2013-01-24T08:35:33Z esxupdate: LiveImageInstaller: DEBUG: output: rm: can't remove '/tardisks/cisco_ve.v00': Device or resource busy
<…truncated..>
2013-01-24T08:35:33Z esxupdate: root: ERROR: InstallationError: ([], "Error in running rm /tardisks/cisco_ve.v00:\nReturn code: 1\nOutput: rm: can't remove '/tardisks/cisco_ve.v00': Device or resource busy\n\nIt is not safe to continue. Please reboot the host immediately to discard the unfinished update.")

I looked for some help on the web and came across this - Problems with uninstalling Nexus 1000v VEM VIB – and here it said perhaps the vem was still running.

So I tried that as well – here we see the vem is still running
vem status
Fail2
Even after stopping the vem – it would not remove the VIB. Maybe the modules were still loaded?
Fail3
Still no go…

I then came across these two KB’s

"The vem-swiscsi process fails to exit even when no Software iSCSI device is found" and "High CPU and memory utilization by the vem-swiscsi process"

They were not relevant to my versions, of either ESXi or the Cisco modules, but they still led me to the right solution.

I checked to see if I had any vem* processes still running.
lsof
After killing the processes.
kill process
The removal was successful.
Success!

The vem-swiscsi process was not killed properly when I stopped the vem (or removed the modules) – which I assume is a bug which was re-introduced since 4.2(1)SV1(5.1).

The Release Notes for Release 4.2(1)SV1(5.1) say that these bugs were resolved
17. CSCtl21012 The vem-swiscsi process fails to exit when no "Software iSCSI" device is found.
44. CSCtr83664 The vem-swiscsi process fails to exit when no "Software iSCSI" device is found.

In short: if you cannot remove the Cisco VEM from an ESXi host, check that there are no vem processes still running, because they will prevent you from removing the module.
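The check can be scripted. Below is a minimal sketch, not the exact commands from the screenshots above: find_vem_pids is a hypothetical helper, and it assumes the process listing has a numeric ID in one column and the process name somewhere on the same line. Which column holds the ID that kill expects depends on the ps output of your ESXi build, so the column is a parameter here.

```shell
# Hypothetical helper: print the ID column of every process line that
# mentions a vem* process (for example vem-swiscsi).
# Reads the listing on stdin; $1 = column number holding the ID (default 1).
find_vem_pids() {
    awk -v col="${1:-1}" '$col ~ /^[0-9]+$/ && $0 ~ "(^|[ /])vem" { print $col }'
}
```

Usage would be along the lines of `ps | find_vem_pids`. Review the list before passing any of the IDs to kill.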

I would like to also thank Frank Denneman for his very useful post on Removing orphaned Nexus DVS.

2013-01-24

Configuring SSH Equivalence for Oracle RAC

SSH Equivalence is one of the prerequisites for an Oracle RAC installation.
There are a number of posts on how to do this, like here or here, and Oracle have even been so kind as to provide a script that will do this for you (even though it is not 100% automated).
The process is relatively simple (when you break it down piece by piece)
  1. Create the .ssh directory under the user's home folder on VM1 and VM2
  2. Create an RSA key on VM1 and VM2
  3. Copy the contents of ~/.ssh/id_rsa.pub from VM1 and VM2 into ~/.ssh/authorized_keys on both VM1 and VM2
  4. You should then be able to connect to each host (and also the localhost as well) without a password prompt.
  5. Repeat the process on both VM’s with the oracle user
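Steps 1 to 3 above can be sketched in bash roughly as follows. This is only an illustration of the manual procedure: make_user_key and add_authorized_key are hypothetical helper names, and the sketch assumes ssh-keygen is installed and that each public key file holds a single line.

```shell
# Step 1 and 2: create ~/.ssh and an RSA key pair with an empty passphrase.
# An existing key is left untouched.
make_user_key() {
    mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
    [ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
}

# Step 3: append one public key to an authorized_keys file,
# unless exactly that line is already present.
# $1 = public key file, $2 = authorized_keys file
add_authorized_key() {
    _key=$(cat "$1")
    touch "$2" && chmod 600 "$2"
    grep -qxF -- "$_key" "$2" || printf '%s\n' "$_key" >> "$2"
}
```

Running add_authorized_key twice with the same key leaves a single entry, which makes it safe to repeat the procedure on both VMs.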
But this process requires a decent amount of manual interaction from the user at the following stages:
  1. Copying the files between VM1 <-> VM2
  2. First connection prompts to add the hosts key to the ~/.ssh/known_hosts file
Manual interaction is the mother of all headaches when you want to automate something. As I have posted before here and here, I am in the middle of automating an Oracle RAC deployment on VMware. This is an additional part of the solution.
I had to come up with a method to do this without any user interaction, and here is how I went about the process. I broke down the whole process – stage by stage.
  1. Re-create the ssh_host_rsa_key – the reason for this being – that since these VM’s are deployed from the same template – the ssh_host_rsa_key is identical – and this caused problems for my script (this actually could be useful in some cases – but not here).
  2. Create the ~/.ssh/id_rsa.pub key for the root user on each host – without prompts.
  3. To avoid the host key prompt when connecting to another VM for the first time, I needed to get the keys from ssh_host_rsa_key.pub into .ssh/known_hosts before I connected to the VM for the first time.
  4. Add the public key from each VM into the ~/.ssh/authorized_keys file.
  5. Get this information from VM1 to VM2 and vice-versa, all without prompts, which meant I could not go through the guest operating system.
  6. Repeat the process for the oracle user.
So my initial challenge was how to do the copying of the files without going through the guest OS, but that actually turned out to be pretty simple. PowerCLI has the Copy-VMGuestFile cmdlet that will allow me to transfer files to and from the guest – so that solved my worries.
There were several issues along the way that I needed to address.
  1. I needed to construct the known_hosts file from several pieces of information: the hostname, the IP address and the ssh_host_rsa_key. I am sure there is an easier way of doing this in bash, but this way works for me.
  2. Creating the rsa keys for the oracle user – since I did not want to connect to the VM twice with two different credentials – here I solved the problem by duplicating the files from the root user to the oracle user and manipulated the contents a bit to suit my needs.
  3. Copying the files back to the guest after manipulation – resulted in a change in their format from UNIX to DOS and I could not find a way to control that from the PowerCLI side – therefore some vi manipulation was needed to convert them back.
So without further ado – here is the script – annotations are at the bottom
<#
 .SYNOPSIS
  Configure SSH equivalence between two Oracle RAC nodes

 .DESCRIPTION
  The script will execute on both guests, configure the RSA keys,
  known_hosts and authorized_keys files on each host for both the 
  root and oracle user to enable SSH equivalence for Oracle RAC

 .PARAMETER  VM1
  Name of the first VM
 .PARAMETER  VM2
  Name of the second VM
 .PARAMETER  VM1_IP
  The IP address of the first VM
 .PARAMETER  VM2_IP
  The IP address of the second VM
 .PARAMETER  HostCredentials
  The credentials for the ESXi host
 .PARAMETER  GuestCredentials
  The credentials for the guest VM 
 .PARAMETER Cleanup
  Will cleanup the temporary files created. On by default
 
 .EXAMPLE
  PS C:\> Set-SSHKeys -VM1 hosta -VM2 hostb -VM1_IP 10.10.10.1 -VM2_IP 10.10.10.2
  This example shows how to call Set-SSHKeys against hosta with the IP address
  of 10.10.10.1 and hostb with the IP address of 10.10.10.2.
 .EXAMPLE
  PS C:\> Set-SSHKeys -VM1 hosta -VM2 hostb -VM1_IP 10.10.10.1 -VM2_IP 10.10.10.2 -HostCredentials `
  (Get-Credential) -GuestCredentials (Get-Credential) -Cleanup:$false
  This example shows how to call Set-SSHKeys against hosta with the IP address
  of 10.10.10.1 and hostb with the IP address of 10.10.10.2, while prompting for credentials
  for both the host and the guest and not cleaning up the files after completion.

 .NOTES
  Author: Maish Saidel-Keesing
  Date: 20 January, 2012
  For more in depth info on the script please see:
  http://technodrone.blogspot.com/2013/01/set-sshkeys.html

#>
function Set-SSHKeys {
 [CmdletBinding()]
 param(
  [Parameter(Position=0, Mandatory=$true)]
  [System.String]$VM1,
  [Parameter(Position=1, Mandatory=$true)]
  [System.String]$VM2,
  [Parameter(Position=2)]
  [System.String]$VM1_IP,
  [Parameter(Position=3)]
  [System.String]$VM2_IP,
  $HostCredentials,
  $GuestCredentials,
  $Cleanup=$true
 )
 # Check for parameters
 if (!$HostCredentials) {
  $HostCredentials = $Host.ui.PromptForCredential("ESXi Host Credentials","Enter the credentials for the ESXi Host","root","")
 }
 if (!$GuestCredentials) {
  $GuestCredentials = $Host.ui.PromptForCredential("Guest VM Credentials","Enter the credentials for the guest VM","root","")
 }
 if (!$VM1_IP) {
 $VM1_IP = (Get-VMGuestNetworkInterface -Name eth0 -vm $VM1 -HostCredential $HostCredentials -GuestCredential $GuestCredentials).IP
 }
 if (!$VM2_IP) {
 $VM2_IP = (Get-VMGuestNetworkInterface -Name eth0 -vm $VM2 -HostCredential $HostCredentials -GuestCredential $GuestCredentials).IP
 }
## script to be executed on VM1
$myscript1 = @"
mv /etc/ssh/ssh_host_rsa_key /etc/ssh/ssh_host_rsa_key.old
mv /etc/ssh/ssh_host_dsa_key /etc/ssh/ssh_host_dsa_key.old
ssh-keygen -t rsa -N "" -f /etc/ssh/ssh_host_rsa_key
ssh-keygen -t dsa -N "" -f /etc/ssh/ssh_host_dsa_key
mkdir ~/.ssh
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
echo -n `$(hostname -s) >> .ssh/known_hosts
echo -n "," >> .ssh/known_hosts
echo -n $VM1_IP >> .ssh/known_hosts
echo -n " " >> .ssh/known_hosts
cat /etc/ssh/ssh_host_rsa_key.pub >> .ssh/known_hosts
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
mkdir /home/oracle/.ssh
cp .ssh/* /home/oracle/.ssh/
chown -R oracle:dba /home/oracle/.ssh
"@
## script to be executed on VM2
$myscript2 = @"
mv /etc/ssh/ssh_host_rsa_key /etc/ssh/ssh_host_rsa_key.old
mv /etc/ssh/ssh_host_dsa_key /etc/ssh/ssh_host_dsa_key.old
ssh-keygen -t rsa -N "" -f /etc/ssh/ssh_host_rsa_key
ssh-keygen -t dsa -N "" -f /etc/ssh/ssh_host_dsa_key
mkdir ~/.ssh
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
echo -n `$(hostname -s) >> .ssh/known_hosts
echo -n "," >> .ssh/known_hosts
echo -n $VM2_IP >> .ssh/known_hosts
echo -n " " >> .ssh/known_hosts
cat /etc/ssh/ssh_host_rsa_key.pub >> .ssh/known_hosts
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
mkdir /home/oracle/.ssh
cp .ssh/* /home/oracle/.ssh/
chown -R oracle:dba /home/oracle/.ssh
"@
 # run the scripts on VM1 and VM2
 Invoke-VMScript -vm $VM1 -ScriptText $myscript1 -ScriptType bash -HostCredential $HostCredentials -GuestCredential $GuestCredentials
 Invoke-VMScript -vm $VM2 -ScriptText $myscript2 -ScriptType bash -HostCredential $HostCredentials -GuestCredential $GuestCredentials
 ## authorized_keys for root
 # get files from guests
 Copy-VMGuestFile -GuestToLocal -Source /root/.ssh/authorized_keys -Destination ./authorized_keys_VM1_root -VM $VM1 -HostCredential $HostCredentials -GuestCredential $GuestCredentials
 Copy-VMGuestFile -GuestToLocal -Source /root/.ssh/authorized_keys -Destination ./authorized_keys_VM2_root -VM $VM2 -HostCredential $HostCredentials -GuestCredential $GuestCredentials
 Copy-VMGuestFile -GuestToLocal -Source /home/oracle/.ssh/authorized_keys -Destination ./authorized_keys_VM1_ora -VM $VM1 -HostCredential $HostCredentials -GuestCredential $GuestCredentials
 Copy-VMGuestFile -GuestToLocal -Source /home/oracle/.ssh/authorized_keys -Destination ./authorized_keys_VM2_ora -VM $VM2 -HostCredential $HostCredentials -GuestCredential $GuestCredentials
 # Change root to oracle to fix running the script with root credentials
 Get-Item .\authorized_keys_*ora | % {
 (get-content $_).Replace("root@","oracle@") | Set-Content $_ -Force
 }
 # concatenate contents of files
 (Get-Content ./authorized_keys_VM1_root) + "`r`n" + (Get-Content ./authorized_keys_VM2_root) + "`r`n" + (Get-Content ./authorized_keys_VM1_ora) + "`r`n" + (Get-Content ./authorized_keys_VM2_ora) | Out-File -FilePath ./authorized_keys -Encoding ascii
 # copy files back 
 Copy-VMGuestFile -LocalToGuest -Source ./authorized_keys -Destination /root/.ssh/ -VM $VM1 -HostCredential $HostCredentials -GuestCredential $GuestCredentials -Force
 Copy-VMGuestFile -LocalToGuest -Source ./authorized_keys -Destination /root/.ssh/ -VM $VM2 -HostCredential $HostCredentials -GuestCredential $GuestCredentials -Force
 $vicmd = "/bin/vi +':w ++ff=unix' +':q' .ssh/authorized_keys"
 $return1 = Invoke-VMScript -ScriptText $vicmd -vm $VM1,$VM2 -HostCredential $HostCredentials -GuestCredential $GuestCredentials
 ## known_hosts for root
 # get files from guests
 Copy-VMGuestFile -GuestToLocal -Source /root/.ssh/known_hosts -Destination ./known_hosts_VM1 -VM $VM1 -HostCredential $HostCredentials -GuestCredential $GuestCredentials
 Copy-VMGuestFile -GuestToLocal -Source /root/.ssh/known_hosts -Destination ./known_hosts_VM2 -VM $VM2 -HostCredential $HostCredentials -GuestCredential $GuestCredentials
 # concatenate contents of files
 (Get-Content ./known_hosts_VM1) + "`r`n" + (Get-Content ./known_hosts_VM2) | Out-File -FilePath ./known_hosts -Encoding ascii
 # copy files back 
 Copy-VMGuestFile -LocalToGuest -Source ./known_hosts -Destination /root/.ssh/ -VM $VM1 -HostCredential $HostCredentials -GuestCredential $GuestCredentials
 Copy-VMGuestFile -LocalToGuest -Source ./known_hosts -Destination /root/.ssh/ -VM $VM2 -HostCredential $HostCredentials -GuestCredential $GuestCredentials
 $vicmd = "/bin/vi +':w ++ff=unix' +':q' .ssh/known_hosts"
 $return1 = Invoke-VMScript -ScriptText $vicmd -vm $VM1,$VM2 -HostCredential $HostCredentials -GuestCredential $GuestCredentials 
 ## authorized_keys for oracle
 # copy files back 
 Copy-VMGuestFile -LocalToGuest -Source ./authorized_keys -Destination /home/oracle/.ssh/ -VM $VM1 -HostCredential $HostCredentials -GuestCredential $GuestCredentials -Force
 Copy-VMGuestFile -LocalToGuest -Source ./authorized_keys -Destination /home/oracle/.ssh/ -VM $VM2 -HostCredential $HostCredentials -GuestCredential $GuestCredentials -Force
 $vicmd = "/bin/vi +':w ++ff=unix' +':q' /home/oracle/.ssh/authorized_keys"
 $return1 = Invoke-VMScript -ScriptText $vicmd -vm $VM1,$VM2 -HostCredential $HostCredentials -GuestCredential $GuestCredentials 
 ## known_hosts for Oracle
 # copy files back 
 Copy-VMGuestFile -LocalToGuest -Source ./known_hosts -Destination /home/oracle/.ssh/ -VM $VM1 -HostCredential $HostCredentials -GuestCredential $GuestCredentials
 Copy-VMGuestFile -LocalToGuest -Source ./known_hosts -Destination /home/oracle/.ssh/ -VM $VM2 -HostCredential $HostCredentials -GuestCredential $GuestCredentials
 $vicmd = "/bin/vi +':w ++ff=unix' +':q' /home/oracle/.ssh/known_hosts"
 $return1 = Invoke-VMScript -ScriptText $vicmd -vm $VM1,$VM2 -HostCredential $HostCredentials -GuestCredential $GuestCredentials 
 # remove temporary files
 if ($Cleanup) {
  Get-Item .\authorized_keys*, .\known_hosts* | Remove-Item -Confirm:$false 
 }
}
Lines 45-57 - The script requires some parameters (two are mandatory): the names of the VMs that will be configured and their IP addresses. If you would like to keep the files created during the process, change the $Cleanup variable to $false (by default $true). Also, in order to run scripts on the guests you will need to provide credentials for the hosts and the guests (I am assuming that all hosts have one password and that the guests have one password as well).

Lines 59-70 - If the credentials were not provided as variables – then you will be prompted. If the IP’s were not provided, they will be retrieved through the API.

Lines 72-106 - The script that should be run on the guests. There is one for each VM – due to the fact that the IP is (of course) different on each of them.
A bit more detail about the script that is run on the guest.
Lines 73-76 - The guest SSH host keys are re-created, as I explained above.
Lines 77-78 – Create the .ssh directory and create the keys. -N sets an empty passphrase on the key and -f is the path of the key file.
Lines 79-83 - The known_hosts file is basically a concatenation of 3 things for each entry:
<hostname>,<IP_Address> <contents of ssh_host_rsa_key.pub> (The comma and the space are important!)
Line 84 - Add the contents of id_rsa.pub to the authorized_keys file.
Lines 85-87 – Copy the files into the oracle user’s directory and make sure the file ownership is correct.
Lines 108-109 – Run the scripts on each VM.
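The known_hosts entry described above can also be built in one step instead of several separate echo and cat calls. This is only a sketch: known_hosts_entry is a hypothetical helper that is not part of the script above, and it assumes the public key file holds a single line.

```shell
# Hypothetical helper: print one known_hosts entry in the form
#   <hostname>,<ip> <contents of the host public key file>
# $1 = short hostname, $2 = IP address, $3 = path to the host public key
known_hosts_entry() {
    printf '%s,%s %s\n' "$1" "$2" "$(cat "$3")"
}
```

On a guest it could be called as `known_hosts_entry "$(hostname -s)" 10.10.10.1 /etc/ssh/ssh_host_rsa_key.pub >> ~/.ssh/known_hosts`, where the IP address is the example value from the help text.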

Lines 112-115 – Copy the files to the local computer for text manipulation.

Lines 117-118 – The authorized_keys are per user, and the ones we created for the oracle user were copies of those from the root user, so the username has to be changed.

Line 121 – Combine all 4 authorized_keys files into one, with carriage returns after each one.

Lines 123-126 – Copy the files back to the guests. As I said above, the files needed some additional vi manipulation because the copy back changed their format from UNIX to DOS.
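If vi is not available on the guest, the same DOS to UNIX conversion can be done by stripping the carriage returns with tr. This is an alternative to the vi call used in the script, not what the script does; dos_to_unix is a hypothetical helper name, and the sketch assumes mktemp exists on the guest.

```shell
# Hypothetical helper: convert a file from DOS (CRLF) to UNIX (LF) line
# endings in place by deleting every carriage return.
# $1 = file to convert
dos_to_unix() {
    _tmp=$(mktemp) || return 1
    # Write back with cat so the original file keeps its owner and mode.
    tr -d '\r' < "$1" > "$_tmp" && cat "$_tmp" > "$1"
    _rc=$?
    rm -f "$_tmp"
    return $_rc
}
```

It could be passed to the guest the same way as the vi command, for example against .ssh/authorized_keys and .ssh/known_hosts.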

Lines 129-137 – The same process for the known_hosts file. Take note – only one copy from each guest was needed, that is because it is VM specific and not user specific. The same vi manipulation as well.

Lines 140-149 – The process is repeated to place the files in the oracle user’s home directory.

Lines 151-153 – Cleanup the files – done by default.

2013-01-23

The SSH Key Problem With Cloned Linux VM’s

First let me start off by saying that how this affects you will depend entirely on your organizational procedures and security requirements.

We all love templates – don’t we? I mean they are the best! You configure your VM to your liking, OS patches, company policy settings etc.. etc.. and every new VM that you deploy – will have the exact same baseline.

Standardization… conformity… in the enterprise – all great.

Except.. a short while ago I found out something which is not exactly the best security practice (to put it mildly).

In vSphere you can create a Guest Customization specification and deploy your VM.

For Linux you can enter in some information, but not much

(Today I found out – that you also can run a custom script to configure the VM Configure a Script to Generate Computer Names and IP Addresses During Guest Operating System Customization in the vSphere Client )

But let’s get back on track.

So you have deployed a VM from a template – it now has an IP – and what is the first thing you would most probably do? SSH into the VM – because you now want to start doing the real work (amazing how we take the deployment of an OS for granted these days).

So if this is the first time you are connecting you will most probably get something like this:

Host key warning

Which to explain in simple terms is saying, “Hey – I don’t know this server – here are its details and RSA key. Do you still want to connect?”

And you would usually say – yes and enter your password – and all is fine and dandy.

What this does is add an entry to the .ssh/known_hosts file

known_hosts

But not only did I deploy one VM from this template – I deployed 2. So let’s repeat the process again.

Host key warning

So where is the problem? If you look carefully – you will see that key fingerprint on vm1 is the same as on vm2

30:d6:df:54:ca:26:b2:a4:df:65:e5:33:9f:21:df:55
30:d6:df:54:ca:26:b2:a4:df:65:e5:33:9f:21:df:55

Identical.

And if we would now look into the known_hosts file then you would see this:

image

All exactly the same (each host created two lines for some reason on Ubuntu – usually it is one).

But why is that? Shouldn’t every VM be unique? I mean they have different MAC addresses, different UUID’s, different IP’s – VMware usually takes care of that.

Well it is pretty simple really – when SSH is installed, the OS package usually creates these files for you. But remember we are cloning from a template – after SSH was installed (that will usually be the case).

That now means that every single VM that was deployed from the template has exactly the same key.

That could be acceptable in your environment – maybe. Maybe not.

But take this example. You are providing vCloud services and your VM’s are spawned from the same templates. All … of… them!!!

Here you could have the same public keys in different organizations – different companies, I am sure you can see how bad this might become from a security perspective.

This can also cause havoc on certain monitoring systems and also will create a number of problems with SSH key authorization.

So how do you solve this? Unfortunately – there is no built-in way to do this with the current functionality in vSphere today. PowerCLI scripts – or other orchestration tools will need to be used to get around this.

What I would personally like is an option to run a guest OS script as part of the deployment process. Yes I know this exists for Windows VM’s today – but there is no such functionality for Linux.

I did a quick check on some of the VM’s in one of the environments I have access to - 50 VM’s

Duplicates

There are duplicates. Actually, I was surprised to see that some were unique.
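A check like this can be scripted. Here is a minimal sketch, assuming you have already collected one line per VM in the form `<host> <fingerprint>`; the collection step and the helper name find_duplicate_fingerprints are assumptions, not what produced the screenshot above.

```shell
# Hypothetical helper: read "<host> <fingerprint>" pairs on stdin and print
# every fingerprint used by more than one host, followed by those hosts.
find_duplicate_fingerprints() {
    awk '
        {
            # Remember which hosts reported this fingerprint, in input order.
            hosts[$2] = (hosts[$2] == "" ? $1 : hosts[$2] " " $1)
            count[$2]++
        }
        END {
            for (fp in count)
                if (count[fp] > 1)
                    print fp ": " hosts[fp]
        }
    ' | sort
}
```

On each VM the fingerprint itself can be printed with `ssh-keygen -l -f /etc/ssh/ssh_host_rsa_key.pub`. Any fingerprint the helper prints is shared by at least two VMs.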

Some food for thought… (which reminds me – time for me to go out for lunch).

Thanks to @brian_smi for the background information from his blog post!

It would be interesting to hear your thoughts and comments on how or if this might pose a problem in your environment. Please feel free to leave them in the comments below.

A quick update to this article. Thanks to Erik Bussink

Twitter

At the moment my recommended solution would be to remove the ssh_host_* files on the VM before you power it down. The files will be recreated once the VM starts up (or a new VM is deployed from this template). Just make sure that whenever you power on the template for maintenance, you remove the files again before you power it down.

Thanks Erik!
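The cleanup step before powering down the template could look like the sketch below. remove_host_keys is a hypothetical helper, /etc/ssh is assumed to be where your distribution keeps the host keys, and whether the keys are regenerated on the next boot depends on the distribution's init scripts, so try it on a single clone first.

```shell
# Hypothetical helper: delete the SSH host key files so that clones of the
# template do not all share the same host keys.
# $1 = directory holding the host keys (default /etc/ssh)
remove_host_keys() {
    _dir=${1:-/etc/ssh}
    rm -f "$_dir"/ssh_host_*
}
```

Only files whose names start with ssh_host_ are removed; sshd_config and everything else in the directory is left alone.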

2013-01-22

A Major Milestone – My First Million!!

No… Not Dollars, not Shekels and not even IRR (of which 1 million is worth ~US$40).

Today I passed 1,000,000 Pageviews on my blog.

image

I never for the life of me expected this would happen. It started on November 27, 2007 – just over 5 years ago, with this post Welcome to the blog

A year later, and 16 posts down the line, I had a whopping 6 subscribers.

In the beginning..

Today – and about 500 posts later ..

And Today..

A lot has happened over these past 5 years – for me personally and in the technology world.

Thank you all for the support, the interest and I hope to continue to provide interesting content and insight into virtualization, cloud and automation as I have up until now.

Who knows what the next years will be like??

2013-01-21

Copy-DatastoreItem - Understanding the Traffic Flow

I brought this up on Twitter a while ago.

I studied the traffic flow – and would like to share it with you here, but first here is the architecture of the testbed – which will help explain in more detail

Architecture

Well the environment is pretty simple. One vCenter server, four ESXi Servers – each with a local datastore and a shared datastore among the 4 hosts. My laptop is the one that was running the Copy-DatastoreItem cmdlet.

Copy-DatastoreItem has the following parameters

  • Item - Specify the datastore item you want to copy. You can use a string to provide a relative path to the item in the current provider location.
  • Destination - Specify the destination where you want to copy the datastore item.

The PowerCLI syntax:

Copy-DatastoreItem -Item $ISOPath -Destination $Destination

The first case was copying to a local datastore on ESX3 (x.x.x.173)

So I fired off the command and the ISO started to copy.

But how does the file get from my laptop to the ESX3? From my laptop directly to the host? Some other way?

In order to check this I chose a file that was reasonably large (~3.0 GB) so I could see the network activity. I opened up the Resource Monitor and sorted the columns to see the network traffic and saw the following.

Traffic from laptop

The network traffic was going from my laptop directly to the vCenter Server – using Powershell of course.

Next, from the vCenter, it had to get to the host of course... but how?

Here is the Resource monitor from the vCenter Server

Traffic to & from vCenter - local datastore

On the top right what you see is that the vpxd.exe (vCenter server process) is the one that has the highest amount of network traffic.

In the bottom window you can see on the first line that it is sending out traffic to ESX3 (.173) over port 902 and on the second line it is receiving traffic from my laptop (.187) over port 443. This makes perfect sense.

PowerCLI is communicating with the vCenter – and the vCenter is sending the traffic over to the host. So if we were to look at the original architecture again , the traffic flow will look like this

Architecture with flow

Next I tried the same thing, but this time to the Shared datastore.

Based on what we had before the flow should be:

Laptop –> vCenter –> ESX host.

But the question is though – which ESXi host? The destination parameter to where I want to copy the file is as follows:

vmstore:/<datacenter_name>/<datastore_name>
in my case – vmstore:/UCS/VMGuestDatastore

There is no indication of which host this will go to. There are a number of hosts in the cluster and they are all connected to the same datastore. So how does the traffic actually flow in the end?

To answer that we need to look at the Resource Monitor on the vCenter again.

Traffic to & from vCenter - shared datastore

As you will notice – the same traffic flow – but in this case – it changed to ESX2 (.172) and was not going through ESX3 anymore.

I tried the process a number of times – and the result was always the same, it always went to ESX2 (.172)

I could not find the logic behind this.

It did not take the hosts sorted by Name nor according to HostId as you can see below.

Name and HostId

This one has me intrigued – and if anyone has any ideas – what the logic is behind how vCenter chooses which host the traffic will flow through – please do share.

2013-01-17

Another PowerShell vExpert.me URL Shortener

Building on Jonathan Medd’s excellent idea of Using PowerShell to access the vExpert.me URL Shortener, I decided to improve it a bit more.
Here is the completed script.
<#
 .SYNOPSIS
  Will create a new vExpert.me URL

 .DESCRIPTION
  Uses the Invoke-RestMethod cmdlet to create a new vExpert.me URL

 .PARAMETER  URL
  URL that should be shortened.
 
 .PARAMETER Custom
  The custom URL that should be used.

 .EXAMPLE
  PS C:\> New-vExpertURL -URL 'http://www.google.com'
  This example shows how to call the New-vExpertURL function with the URL parameter and generate a random URL.

 .EXAMPLE 
  PS C:\> New-vExpertURL -URL 'http://www.google.com' -Custom this_is_my_link
  This will create a custom URL of http://vexpert.me/this_is_my_link pointing to http://www.google.com
 .INPUTS
  System.String

 .OUTPUTS
  System.String

 .NOTES
  For more information about advanced functions, call Get-Help with any
  of the topics in the links listed below.

#>
function New-vExpertURL {
 [CmdletBinding()]
 param(
  [Parameter(Position=0, Mandatory=$true)]
  [System.String]$URL,
  [Parameter(Position=1)]
  [System.String]$Custom
 )
 begin {
 if (!$($Custom) ) {
  $baseurl_2 = "&action=shorturl&format=json&url="
  } else {
  $baseurl_2 = "&action=shorturl&keyword=" + $Custom + "&format=json&url="
 }
 $baseurl_1 = "http://vexpert.me/yourls-api.php?signature="
 $secret = "xxxxxxxxx"
 
 }
 process {
 $invokeurl = $baseurl_1 + $secret + $baseurl_2 + [System.Uri]::EscapeDataString($URL)
 $vExpertme = Invoke-RestMethod -Uri $invokeurl
 $vExpertme.shorturl | clip
 Write-Host "The shortened URL is now in your clipboard" -ForegroundColor Green
 }
 end {
 }
}
It is quite self explanatory. You will need to enter your personal secret code in Line 47.
So I added 4 things
  1. This is now a function – and it accepts parameters.
  2. One of the parameters is Custom, which will allow you to enter your custom text if you please.
  3. The output will provide the URL and a success message.
  4. The URL will now be in your clipboard so you can use it.

2013-01-15

Red Hat 6.4 Clustering Changes

I am pretty sure that this one slipped under the radar, because it was not well advertised.

On December 4th, 2012 Red Hat released the latest Beta of RHEL 6.4. The release notes can be found here.

So what has changed?

VMware PV Drivers

The VMware para-virtualized drivers have been updated to provide a seamless out-of-the-box experience when running Red Hat Enterprise Linux 6.4 in VMware ESX. The Anaconda installer has also been updated to list the drivers during the installation process. The following drivers have been updated:

  • a network driver (vmxnet3)

  • a storage driver (vmw_pvscsi)

  • a memory ballooning driver (vmware_balloon)

  • a mouse driver (vmmouse_drv)

  • a video driver (vmware_drv)

The other interesting tidbit was this:

Support for VMDK-based Storage

Red Hat Enterprise Linux 6.4 adds support for clusters utilizing VMware's VMDK (Virtual Machine Disk) disk image technology with the multi-writer option. This allows you, for example, to use VMDK-based storage with the multi-writer option for clustered file systems such as GFS2.

Up until now, according to this Red Hat KB (Virtualization Support for High Availability in Red Hat Enterprise Linux 5 and 6), the only way you could install a Red Hat cluster on VMware was by using an RDM (Physical/Virtual compatibility mode), a Raw Disk or in-guest iSCSI disks.

This release changes that: you can use a regular VMDK, which means regular VMFS.

Virtualization Support for High Availability in Red Hat Enterprise Linux

I think that the time is right for VMware to put up a page for RHEL clustering support, similar to the one they have for Microsoft Clustering. No such page exists today.

**Just as a side note for Red Hat 5.9 – neither of these features was added.**

2013-01-03

PowerCLI Does not officially support PowerShell v3

Just a heads up. According to the Release Notes, PowerCLI is not supported on PowerShell v3.

Release Notes

Does this mean that it will not work? No, of course not! From what I have tested it works almost flawlessly – but there are some quirks…

For example - Set-NetworkAdapter returning 'Operation is not valid due to the current state of the object' and Error with Move-VM: Operation is not valid due to the current state of the object.

And as Luc put it, “Well, you can't file a bug for something that is not supported, now can you” :-)

Update – February 12th, 2013

VMware have now released an updated version – see the announcement here - PowerCLI 5.1 Release 2 Now Available

This will actually render this blog post obsolete – but I am happy VMware have addressed this issue.

2013-01-02

Invoke-VMScript Failed - and how I was Baffled.

Luc Dekens wrote a great post a while back, Will Invoke-VMScript work?, about the prerequisites needed in order to get Invoke-VMScript to work. Stop for a minute and go and read his post.

Glad to have you back.

As part of an Oracle RAC provisioning script that I am working on – one of the first things I wanted to do was to configure the network settings for my two nodes – with parameters taken from a config file.

Of course, if the VM does not have an IP address yet then you cannot configure it through the network, so here is where Invoke-VMScript comes into play.

A few things right off the bat. My configuration was working with both the 32-bit engine and the 64-bit engine. The rest of the prerequisites were all there.

So here is what was happening. In the script I had stored the HostCredentials and the GuestCredentials each in a variable. When it came time to power on the VMs, the script would wait for VMware Tools to start running in the guest and then run my script inside the guest OS – but the command would fail with this message.

Invoke-VMScript : 02/01/2013 15:17:07    Invoke-VMScript        Error occured while executing script on guest OS in VM
'testdbCA1b'. Could not locate "Powershell" script interpreter in any of the expected locations. Probably you do not have enough permissions to execute command within guest.
At line:5 char:1
+ Invoke-VMScript -ScriptText $bb -vm $dbvm2 -HostCredential $hostcreds -GuestCred ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ResourceUnavailable: (testdbCA1b:VirtualMachineImpl) [Invoke-VMScript], VimException
    + FullyQualifiedErrorId : Client20_VmGuestServiceImpl_RunScriptCore_ExeLookupFailed,VMware.VimAutomation.ViCore.Cmdlets.Commands.InvokeVmScript

Now this was really weird, because it was simply not true. To make it even more baffling, when I ran the commands manually, not as part of the script, they worked without a problem. So I was starting to think that perhaps the credentials were not being passed down to the command properly – which was not likely – but I had no other clue.

I then wondered – Invoke-VMscript – interacts with the VMware Tools in the guest through VIX – so maybe there was a problem there.

So I checked my versions (it was 1.12) and looked at the release notes and saw something there that ultimately led me onto the right path.

VIX

OK I was not getting any of these errors, but my command would not work. So I wanted to see what the logs of the VMware Tools in the guest were saying – after all it was interacting with the guest through VMware Tools. But where was the log?

KB1007873 - Enabling debug logging for VMware Tools within a guest operating system showed me the way to enable VMware Tools logging in a Linux VM.

You need to create a file (if it does not exist) /etc/vmware-tools/tools.conf and add to that file:

log = true
log.file = /tmp/vmtools.log

I then performed the following to test my theory.

  1. Restart the VM
  2. Wait for the tools to report Running and up-to-date
  3. Invoke-VMscript

This I did with a simple PowerShell script.

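# Restart the guest and give it a moment to actually go down
Restart-VM -VM $dbvm2 -Confirm:$false
Start-Sleep -Seconds 10
# Poll until VMware Tools reports that it is running
while ((Get-VM $dbvm2).ExtensionData.Guest.ToolsRunningStatus -ne "guestToolsRunning") {
    Write-Host "....." -ForegroundColor Yellow
    Start-Sleep -Seconds 5
}

# Keep retrying Invoke-VMScript until it succeeds, printing a timestamp for each failure
Invoke-VMScript -ScriptText $bb -VM $dbvm2 -HostCredential $hostcreds -GuestCredential $guestcreds
while ( $? -eq $false ) {
    Get-Date -Format HH:mm:ss
    Start-Sleep -Seconds 2
    Invoke-VMScript -ScriptText $bb -VM $dbvm2 -HostCredential $hostcreds -GuestCredential $guestcreds
}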

So I noticed a few things.

  1. VMware Tools comes up and reports itself as running way before the OS is actually available and you have a console prompt.
    Even before SSH starts.
  2. The first 3-5 tries of Invoke-VMScript would fail with the same error message I had before. Then suddenly it would work as if nothing was wrong.

I went to look in the VMware Tools log that I had just configured, and there I found something very strange. It solved my mystery, but I still do not have an answer as to why it is happening.

For the failed attempts I had this in the log.

[Jan 02 13:17:06.925] [   debug] [vix] VixTools_StartProgram: args: progamPath: 'cmd.exe', arguments: '/C powershell -NonInteractive -EncodedCommand cABvAHcAZQByAHMAaABlAGwAbAAuAGUAeABlACAALQBPAHUAdABwAHUAdABGAG8AcgBtAGEAdAAgAHQAZQB4AHQAIAAtAE4AbwBuAEkAbgB0AGUAcgBhAGMAdABpAHYAZQAgAC0AQwBvAG0AbQBhAG4AZAAgACcAJgAgAHsAbABzACAALQBsAGEAfQAnACAAPgAgACIALwB0AG0AcAAvAHAAbwB3AGUAcgBjAGwAaQB2AG0AdwBhAHIAZQAwACIAOwAgAGUAeABpAHQAIAAkAGwAYQBzAHQAZQB4AGkAdABjAG8AZABlAA=='', workingDir: '

[Jan 02 13:17:11.988] [   debug] [vix] VixTools_StartProgram: args: progamPath: 'cmd.exe', arguments: '/C powershell -NonInteractive -EncodedCommand cABvAHcAZQByAHMAaABlAGwAbAAuAGUAeABlACAALQBPAHUAdABwAHUAdABGAG8AcgBtAGEAdAAgAHQAZQB4AHQAIAAtAE4AbwBuAEkAbgB0AGUAcgBhAGMAdABpAHYAZQAgAC0AQwBvAG0AbQBhAG4AZAAgACcAJgAgAHsAbABzACAALQBsAGEAfQAnACAAPgAgACIALwB0AG0AcAAvAHAAbwB3AGUAcgBjAGwAaQB2AG0AdwBhAHIAZQAwACIAOwAgAGUAeABpAHQAIAAkAGwAYQBzAHQAZQB4AGkAdABjAG8AZABlAA=='', workingDir: '

But for the successful attempt the log showed (which was what I expected)

[Jan 02 13:17:23.211] [   debug] [vix] VixTools_StartProgram: args: progamPath: '/bin/bash', arguments: '-c "bash > /tmp/powerclivmware0 2>&1 -c \"ls -la\""'', workingDir: '
[Jan 02 13:17:23.211] [   debug] [vmsvc] Executing async command: '"/bin/bash" -c "bash > /tmp/powerclivmware0 2>&1 -c \"ls -la\""' in working dir '/root'
[Jan 02 13:17:23.214] [   debug] [vix] VixToolsStartProgramImpl started '"/bin/bash" -c "bash > /tmp/powerclivmware0 2>&1 -c \"ls -la\""', pid 3792

Now here is the weird part. If you look at the first two failures, you will see that VIX is trying to execute a Windows command on a Linux operating system – which…. probably .. won’t…. really…. work…. !!!

Only about 20 seconds later did it execute the correct bash command, in my case ‘ls -la’, and of course it worked.
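
To see exactly what VIX was trying to run in the failed attempts, the -EncodedCommand payload from the log can be decoded. PowerShell expects that payload to be Base64 over UTF-16LE text, so a short sketch like this recovers it (the variable names are mine):

```powershell
# Payload copied verbatim from the two failed log entries.
$encoded = 'cABvAHcAZQByAHMAaABlAGwAbAAuAGUAeABlACAALQBPAHUAdABwAHUAdABGAG8AcgBtAGEAdAAgAHQAZQB4AHQAIAAtAE4AbwBuAEkAbgB0AGUAcgBhAGMAdABpAHYAZQAgAC0AQwBvAG0AbQBhAG4AZAAgACcAJgAgAHsAbABzACAALQBsAGEAfQAnACAAPgAgACIALwB0AG0AcAAvAHAAbwB3AGUAcgBjAGwAaQB2AG0AdwBhAHIAZQAwACIAOwAgAGUAeABpAHQAIAAkAGwAYQBzAHQAZQB4AGkAdABjAG8AZABlAA=='

# "Unicode" is the .NET name for UTF-16LE, the encoding -EncodedCommand uses.
$bytes   = [System.Convert]::FromBase64String($encoded)
$decoded = [System.Text.Encoding]::Unicode.GetString($bytes)
$decoded
# powershell.exe -OutputFormat text -NonInteractive -Command '& {ls -la}' > "/tmp/powerclivmware0"; exit $lastexitcode
```

So the same ‘ls -la’ and the same /tmp output file are in there, just wrapped for a Windows PowerShell interpreter that the Linux guest does not have.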

So here I found my way around my problem – but have not gotten to the bottom of the mystery yet. I put in an extra sleep statement into the script that would wait a bit longer until the OS was completely up and only then run the Invoke-VMscript command – and all was working fine…

So a few things I learned today:

  • How to enable logging for VMware Tools
  • VIX does weird things.
  • A workaround is as good a solution as any other.
  • I would add one more thing to Luc’s prerequisites – wait until the VM has completely started before attempting to use Invoke-VMscript.
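
That last point can also be sketched as a bounded retry rather than a fixed sleep. This is only a sketch under a few assumptions: Invoke-WithRetry is a hypothetical helper of mine, the attempt count and delay are arbitrary, and it relies on -ErrorAction Stop turning the Invoke-VMScript failure into a terminating error so that catch can see it.

```powershell
# Hypothetical helper: runs a script block until it succeeds
# or the attempts run out, pausing between tries.
function Invoke-WithRetry {
    param(
        [Parameter(Mandatory = $true)][scriptblock]$Action,
        [int]$MaxAttempts = 10,
        [int]$DelaySeconds = 5
    )
    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        try {
            # Hand back the action's output as soon as it completes without error.
            return & $Action
        }
        catch {
            Write-Verbose "Attempt $attempt failed: $($_.Exception.Message)"
            # Out of attempts: rethrow the last error to the caller.
            if ($attempt -eq $MaxAttempts) { throw }
            Start-Sleep -Seconds $DelaySeconds
        }
    }
}

# Intended use with the variables from the script above (needs PowerCLI and a live VM):
# Invoke-WithRetry -Action {
#     Invoke-VMScript -ScriptText $bb -VM $dbvm2 -HostCredential $hostcreds `
#         -GuestCredential $guestcreds -ErrorAction Stop
# }
```

Unlike the while ( $? -eq $false ) loop, this gives up after a fixed number of tries instead of spinning forever if the guest never comes up.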