Monday, 5 August 2013

vCloud Director - Moving vApps Between Clusters In the Same PvDC



I had a request from a customer of mine to be able to move their vApps and VMs between vSphere clusters in the same vCD Provider Virtual Datacenter.  The customer's first thought was to drop down into vCenter and just do a normal vMotion; however, they got the standard message warning that this object is managed by vCD.


This should have made the customer think that maybe doing it this way was not the best idea, but they pressed on and migrated the VM anyway.  Doing so broke the relationship with vCD, and the VM became unmanageable in vCD.

Now if this was a simple migration of a VM/vApp between clusters using the same storage, we could go to System > Manage & Monitor > Resource Pools, select the VM from the source cluster and then select "Migrate" to move the VM to the destination cluster.



Now the customer I was working with has been using VBLOCKs in their datacenter to build their vCloud environment.  This is brilliant, as the VBLOCK is a very impressive bit of kit, and after working with them for a number of months I am very happy to recommend them to customers.  But using a VBLOCK did present us with a challenge.

The customer was using a single Provider Virtual Datacenter in vCD.  This was backed by several clusters, and each cluster corresponded to a single VBLOCK.  The clusters were 24-32 hosts each.  Each VBLOCK has its own VNX SAN, and the storage from this is only presented to the hosts in the same VBLOCK.  There is no storage shared between the VBLOCKs, and thus none between the clusters.

So if we have no shared storage between the clusters backing the Provider Virtual Datacenter, how do we move VMs between the clusters?  The answer is Storage vMotion!  Hang on, Storage vMotion is not a selectable option in vCD.

After thinking a little more, I decided we could select the VM's storage profile and change it.  This should instruct the VM to conduct a joint "Change Host and Change datastore" migration.

In the example below I have my vApp built on a cluster called "Site1".  This cluster has a corresponding storage profile.  The storage for this profile is only mounted on the hosts in the cluster "Site1".

Now the vApp and the VM must be completely shut down.  If the vApp is showing a status of "Partially Running" then this operation will fail.  The same applies if the VM is not shut down.  You will see an error message of "Invalid Parameter".

Select a storage profile that is linked to another cluster in the Provider Virtual Datacenter.  In my lab this is "Site2".  Then select OK.

Now the VM will show as "Busy" while the Storage vMotion is conducted in the background.  If you drop to the vSphere Client you will see more information: it will show the VM as being relocated.
 
Once the Storage vMotion has completed, the VM will show as normal in vCD, and the vSphere Client will show that the VM has moved to the other cluster.

This resolved my customer's problem, and we integrated it into a rather cool vCO workflow that could be kicked off from the customer's Cisco-based cloud portal.  I may detail the workflow in another post, but it basically looked at the VMs inside a vApp and then changed their storage profile to relocate them based on the input from the user.
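As a rough illustration of what that workflow did, here is a minimal Python sketch.  It is not the vCO workflow itself and it does not call the vCloud API; the `Vm` class, the `plan_relocation` function and the VM names are hypothetical stand-ins.  It only models the two rules from this post: every VM in the vApp must be powered off, and each VM is relocated by switching it to the storage profile of the target cluster.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Vm:
    """Hypothetical stand-in for a VM inside a vApp."""
    name: str
    powered_off: bool
    storage_profile: str


def plan_relocation(vms: List[Vm], target_profile: str) -> List[Tuple[str, str, str]]:
    """Return (vm name, current profile, target profile) for each VM to move.

    Raises ValueError if any VM is still running, mirroring the rule that
    the vApp and its VMs must be completely shut down first.
    """
    running = [vm.name for vm in vms if not vm.powered_off]
    if running:
        raise ValueError("Shut down before relocating: " + ", ".join(running))
    # VMs already on the target profile need no change.
    return [
        (vm.name, vm.storage_profile, target_profile)
        for vm in vms
        if vm.storage_profile != target_profile
    ]


if __name__ == "__main__":
    vapp = [Vm("web01", True, "Site1"), Vm("db01", True, "Site1")]
    for name, old, new in plan_relocation(vapp, "Site2"):
        print(f"{name}: {old} -> {new}")
```

Checking the power state up front means a partially running vApp is rejected before anything is changed, rather than failing halfway with "Invalid Parameter".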

My customer then requested that this move between clusters be conducted with no downtime.  The ONLY way this is possible is to use a swing datastore: a datastore that is presented to all the hosts in the Provider Virtual Datacenter.  This breaks a number of VCE design constraints and would mean having a datastore at least as large as the largest VM.  In my customer's case this is 8TB.  So this 8TB would be sitting there doing nothing most of the time, as a VCE design constraint is not to share storage between VBLOCKs, let alone run workloads on it.
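The sizing rule for the swing datastore can be sketched in a few lines of Python.  This assumes VMs are moved one at a time and ignores any headroom for swap files or snapshots; the function name and the VM names and sizes (other than the 8TB figure) are made up for illustration.

```python
from typing import Dict


def swing_datastore_size_tb(vm_sizes_tb: Dict[str, float]) -> float:
    """Smallest swing datastore (in TB) that can hold any single VM in transit."""
    if not vm_sizes_tb:
        raise ValueError("no VMs supplied")
    # Only one VM sits on the swing datastore at a time (assumption),
    # so the largest VM sets the minimum size.
    return max(vm_sizes_tb.values())


if __name__ == "__main__":
    sizes = {"web01": 0.5, "db01": 8.0, "app01": 2.0}  # hypothetical inventory
    print(swing_datastore_size_tb(sizes), "TB")
```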

I am working to resolve this at the moment but I am not sure I will find a resolution. We are constrained by the CBT technology used in Storage vMotion and the design constraints of VCE.



Wednesday, 12 June 2013

vCloud Director - Set Administrator Password and then you can't log in?

Hi All

Had a very strange problem today while delivering a vCD POC.  I left the database creation to the user and we just connected to it.  I provided them with the scripts that are located in the Installation Guide.  We were provided with a SQL user account as normal, ran the ./configure script and connected to the database with no problems.

Once we had completed the configuration we connected to the web address for the cell and went through the initial configuration.  We set the password for the administrator and completed the wizard.

After this we then tried to log in to the cell:

User: administrator
Password: ******** (same as was set in the wizard)

Result = Access denied?

Very strange, as I was 100% certain the password was correct.  So I asked the SQL admin to run the following to reset the cell to an unconfigured state:

update config set value='false' where name = 'vcloud.system.initialized';

Once this was done we went through the initialization of the cell again and set the same password, just in case I had typed the password incorrectly.  Still had the same Access Denied message.
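To show what that reset statement does, here is a small Python sketch that runs it against an in-memory SQLite database.  This is only a stand-in: the real vCD database was on SQL Server, and the two-column `config` table created here is an assumption based purely on the columns named in the statement, not the actual vCD schema.

```python
import sqlite3

# The statement from the post, unchanged apart from the trailing semicolon.
RESET_SQL = "update config set value='false' where name = 'vcloud.system.initialized'"


def reset_initialized_flag(conn: sqlite3.Connection) -> int:
    """Run the reset statement and return the number of rows it changed."""
    cur = conn.execute(RESET_SQL)
    conn.commit()
    return cur.rowcount


def demo():
    """Build a toy config table, reset the flag, and return (rows changed, new value)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical minimal schema, just enough for the statement to run.
    conn.execute("create table config (name text primary key, value text)")
    conn.execute("insert into config values ('vcloud.system.initialized', 'true')")
    changed = reset_initialized_flag(conn)
    value = conn.execute(
        "select value from config where name = 'vcloud.system.initialized'"
    ).fetchone()[0]
    conn.close()
    return changed, value


if __name__ == "__main__":
    print(demo())
```

Checking the row count is a cheap way to confirm the statement actually hit the flag before re-running the setup wizard.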

So I ran the above SQL statement again to reset the cell.  I then asked to have a look at SQL Server Management Studio.  The DBA let me have a look and I could see he had made the account being used by vCD a sysadmin!  This was not in the scripts in the install guide, so it must have been done afterwards.  Now I have seen strange things when sysadmin accounts have been used for DB access in previous VMware products.  Traditionally VMware products have suggested giving the accounts sysadmin access at install and then locking them down afterwards, but most of the newer releases give more information on the user rights needed, so we no longer need to give sysadmin rights at all.

I removed the sysadmin privilege and reconfigured the cell with the same password as the last two times.  Amazingly, at the login screen I was then able to log in.

Now I have no idea why this resolved the login problem, and I have asked if I can log a bug internally to see if this is as designed or if it is a small bug.  I was using the 5.1.2 vCD binaries on SQL 2008 Express; again, I don't know if this had an effect on what was happening.  I just wanted to post this in case anyone has a similar problem.  Hopefully this will help.

Tuesday, 11 June 2013

vBrownbags and ProfessionalVMware.com

Hi again

I wanted to let everyone know I have been hosting some sessions on the popular community site professionalvmware.com.  I have covered two objectives from the VCAP-CID blueprint, and I am planning on hosting a networking one soon to go a little more in depth into the network virtualization we introduce in vCNS and vCD.

Here are the links to my two vBrownbags.  Be sure to add me and the vBrownbag crew on twitter.

Phil Monk  vBrownbags

#vBrownbag Follow-Up Phil Monk Covering VCAP5-CID Objective 4 – vCloud Security

#vBrownbag with Philip Monk covering VCAP5-CID Objective 3

Thanks
Phil

Please Leave Feedback..... I Would love to hear what you think, good or bad

Hi Everyone

I have had some impressive view counts on some of the stuff I have published, especially the SRM RecoverPoint article I posted.  But next to no one leaves any feedback.  :-(  Good or bad, or even a follow-up question.  Please feel free to ask me anything.

Many thanks
Phil

CBT - Changed Block Tracking and its common misinterpretations

I have been working on a vCloud project for the last few months and it has now come to looking at the backup of the solution.  Commvault is the existing backup vendor and is going to continue to provide backup services in this Private cloud.

Now the customer came to me with an interesting question about how CBT worked: they wanted to understand how the changed blocks in the vmdk were tracked.  This was partly down to their thirst for information and partly, I think, because they were concerned about constraints that might be seen with IO and CPU.

The perception of most people is that CBT creates a snapshot, records all the changes in it and then consolidates it, much like the usage of a normal snapshot in VMware.  This is incorrect, and although I am no storage expert I will attempt to explain how it works from my VMware background.

How it actually works is by keeping track of the blocks that have changed in a vmdk based on significant disk events recorded for a specific vmdk.

Simple right?  Well the next question from my customer was "Well how does it do that? And is it going to cause me pain on my storage IO and/or my hosts"  - This is a valid question as recording all this information has to have some overhead.

So how does it do it?  Let's use the most common usage of CBT, backup of a virtual machine.
  1. We take a full backup of a VM with CBT enabled on all disks.  This takes a snapshot to record all IO while the backup is being processed on the original vmdk.  The backup application records the Change Clock timestamp (T1) at the time the snapshot is created to facilitate this backup.  After the backup is completed, normal snapshot consolidation is conducted and the changes in the delta are merged into the original vmdk.
  2. When the next backup is started 24 hours later, another snapshot is taken of the VM's vmdks.  The backup application records the Change Clock timestamp (T2) at the time of creating this snapshot.  All changes are written to the delta vmdk while we back up the changed blocks in the original vmdk.  We still create a snapshot as we need to have access to the original vmdk.
  3. The backup application then backs up only the blocks of the original vmdk that have changed between the two timestamps (C).
Now the next part of the process I wanted to examine was "How does it know which blocks have changed?"

When we enable changed block tracking (detailed in KB1020128) and the VM is powered on for the first time, have a look in the Datastore Browser at the VM's home directory.  You will see a file named "vm_name-ctk.vmdk".  This file is used to track the blocks that are changed on a given vmdk.  The vmdk has a pointer configured to this auxiliary file, which tracks changes made to blocks in the corresponding vmdk.


So when step three, listed above, is performed the auxiliary file is used to identify blocks based on the two time stamps, and the backup application then performs a backup of the changed blocks.
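To make the idea concrete, here is a toy Python model of a change map and the "what changed since T1" query.  This is purely my own illustration: the `ChangeTracker` class, its counter-style clock and the one-entry-per-block layout are assumptions, not the real -ctk.vmdk format or the real Change Clock.

```python
from typing import List


class ChangeTracker:
    """Toy per-disk change map: remembers when each block was last written."""

    def __init__(self, num_blocks: int) -> None:
        self.clock = 0                        # stand-in for the change clock
        self.last_changed = [0] * num_blocks  # 0 means "never written"

    def write(self, block: int) -> None:
        """Record a write to one block and advance the clock."""
        self.clock += 1
        self.last_changed[block] = self.clock

    def mark(self) -> int:
        """Return the current clock value, as recorded at snapshot time."""
        return self.clock

    def changed_since(self, timestamp: int) -> List[int]:
        """List the blocks written after the given clock value."""
        return [b for b, seq in enumerate(self.last_changed) if seq > timestamp]


if __name__ == "__main__":
    disk = ChangeTracker(num_blocks=8)
    disk.write(1)
    disk.write(5)
    t1 = disk.mark()               # first (full) backup
    disk.write(2)
    disk.write(5)
    print(disk.changed_since(t1))  # blocks for the incremental backup
```

In this model the query is made at the moment the second snapshot is taken, so everything written after T1 is by definition between T1 and T2.  Only a small per-block record is updated on each write, which is why the tracking overhead stays low compared with copying data into a snapshot.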

Now the change clock is not a normal wall clock; it utilizes a Unix Epoch based clock.  I don't work in engineering and I am under NDA from VMware (being an employee), so I can't share any more of the nitty-gritty details.

But stepping back, my customer wanted to know if this was going to cause them any pain with IO and/or CPU on the hosts.  The simple answer is that the impact of using CBT is minimal.  It will maybe cause the kernel to use 1 or 2% more CPU per 20 or 30 VMs.  However, it is important to remember that the small increase the kernel sees now avoids the much larger IO and/or CPU load of fully backing up every VM every night without CBT.  Backups will also be much shorter due to the smaller amounts of data being backed up.

Hope this has been helpful, many thanks.

Phil


Thursday, 30 May 2013

VMware Site Recovery Manager (SRM), EMC RecoverPoint - No Array Pairs

So I have been doing a mixture of many things since joining VMware, mainly vCloud Director and vCloud Networking and Security based projects.  However, I was tasked with an SRM plan and design for a medium to large (400 VM) company.

During this piece of work I hit something that I thought I should highlight, as I could only find one article on the problem, posted by our friends over at Xtravirt.  The post was more or less identical to what I had done to resolve the problem, but I was searching to see if there were any additional activities I may have missed that could have impacted my customer.

So what was the problem?

Well, I was installing SRM.  I configured all the resource mappings for failover and failback, installed the SRA for EMC RecoverPoint, paired the SRAs and ran a discovery.  And I found NO paired datastores, as shown in the screen below.


So I asked the storage admins to check the policy for the consistency groups in RecoverPoint.  The customer had a look but was new to EMC and RecoverPoint, so I logged on to check this for them.

This screenshot is from the Xtravirt blog, as I could not use a customer one and don't have any RecoverPoint appliances configured.  Hope Xtravirt don't mind.


When I logged on, the policy was set to "Group is in maintenance mode, It is managed by RecoverPoint, SRM can only Monitor".  In so many words, this means that SRM cannot action anything on the consistency group, so failover and test failover activities cannot be performed by the SRA being used with SRM.  As a result SRM classes the consistency group as an object it has no control over and so does not show it in the SRM Array Managers.  Changing the option to "Group is managed by SRM, RecoverPoint can only monitor" changes this behavior and allows the SRA full control over the LUNs in the consistency group.
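The effect of that policy setting can be pictured with a short Python sketch.  This is only a model of the behavior described above, not the SRA or RecoverPoint API; the `ManagedBy` enum, the function name and the consistency group names are all hypothetical.

```python
from enum import Enum
from typing import Dict, List


class ManagedBy(Enum):
    """Who controls a consistency group (simplified to the two options above)."""
    RECOVERPOINT = "managed by RecoverPoint, SRM can only monitor"
    SRM = "managed by SRM, RecoverPoint can only monitor"


def groups_visible_to_srm(policies: Dict[str, ManagedBy]) -> List[str]:
    """Return the consistency groups SRM would list, i.e. those it controls."""
    return sorted(name for name, mode in policies.items() if mode is ManagedBy.SRM)


if __name__ == "__main__":
    policies = {"cg-prod": ManagedBy.RECOVERPOINT, "cg-test": ManagedBy.RECOVERPOINT}
    print(groups_visible_to_srm(policies))  # nothing to pair yet

    policies["cg-prod"] = ManagedBy.SRM     # flip the policy on one group
    print(groups_visible_to_srm(policies))
```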

Now that the option is configured on the RecoverPoint appliance, go back to SRM and refresh the SRA, making sure it is paired with the remote site first.  You should now see all the paired LUNs in the display window.



Friday, 22 March 2013

vCloud Director - Building Skills

After moving to VMware I have been doing a massive amount of vCloud Director work.  I am working with a VMware, Cisco and EMC based partner (I am sure you can guess the name), alongside a VMware colleague, Magnus Anderson (http://vcdx56.com/), designing what will be one of the UK's biggest vCloud installations.

I have had good vCloud skills for a while now, but working on a project as large as this one, with this level of automation from vCenter Orchestrator and Cisco Cloud Portal (CCP), has improved my skills massively.

I have also been working with the guys over on Professional VMware and have presented on the EMEA Brownbags.  I am due to present on another topic in the near future.  I would jump onto the Brownbags if you're studying for the VCAP-CID and the soon-to-be-released VCAP-CIA, as they have helped me no end in refreshing and improving my knowledge on many topics.  One of the best BrownBags I have watched is @LawrenceKohan's BrownBag on vCenter Chargeback Manager.  This is a product I configure in every project, but it is something I often need to refresh my knowledge on, and Lawrence's knowledge of the product is outstanding.

Anyway, I will be updating the blog a bit more often now, as I have some interesting stuff coming up that I would love to share.