Just a small reminder for the people that live in my area. On Friday, June 7th, the German VMware User Group (VMUG) West will be meeting up at the EMC office in Schwalbach (click here for a PDF with the address and route). In case you don’t know what the VMUG is about, here’s a quick summary:
The VMware User Group (VMUG) is an independent, global, customer-led organization, created to maximize members’ use of VMware and partner solutions through knowledge sharing, training, collaboration, and events.
The beauty of it? It’s something set up by users for other users. That means that people come to these events to get vendor-neutral information, and can talk freely to others without having to fear that someone is just giving them a marketing pitch. Or at least, that’s what it should be like.
So, the Germany West VMUG meeting takes place on Friday, June 7, 2013, at the following time and address:
09:30 – 16:15
EMC Deutschland GmbH
Am Kronberger Hang 2a
65824 Schwalbach/Taunus
You can use this link to register for the event, free of charge, and get to see talks on VMware Nicira, “VMware Network & Security”, and other security-related topics.
And one important thing to note: the VMUG is a community set up by VMware customers for VMware customers, to exchange ideas, discuss common issues or worries, learn, and get to know others in the community. If you feel like you can contribute, submit a proposal for a talk, or suggest a topic for the next VMUG. The more people participate, the better a VMUG gets!
I’ll be there, and I’m looking forward to seeing you there!
I present a lot on vCenter Operations Manager, a pretty neat monitoring tool from VMware. I like this tool a lot: getting started with it is easy enough, you have a plethora of features once you dive deeper into it, and the best part? If you use the “big” version (Enterprise or Enterprise Plus, that is), you can even monitor your applications and non-virtualized infrastructure.

To monitor things that go beyond your virtual machines, you can install so-called “adapters”. In a nutshell, such an adapter is nothing more than a piece of software that tells vCenter Operations how to connect to things, and how to interpret the results it gets back. Now, EMC has created such an adapter for its VMAX and Symmetrix storage arrays, along with a document that tells you how to set up and configure the adapter. That way, you can get loads of information from your storage system inside of vCenter Operations. Great stuff, right?

Yeah, OK, maybe not so great. The biggest problem is that the documentation seems to have been written for the normal installable version of vCenter Operations. However, VMware has also created a version in the form of an appliance, a so-called vApp. You download the files, deploy the vApp, enter the IP addresses of the two virtual machines that are contained in the vApp, and away you go. Wonderfully easy to install, and apart from certain limits in scalability, it offers pretty much the same functionality as the normal installer. This is where the problem starts if you want to use the EMC Symmetrix adapter.

You can find almost all adapters on the Integrien FTP site, and there’s a folder containing all the files you need to get started with the Symmetrix adapter right here. My teammate Matt Cowger actually wrote a nice blog post on how to configure and set up the Symmetrix adapter. This works like a charm, except for one tiny thing that you will run into when using the vCenter Operations vApp. When you go to create an adapter instance, you need to give it a name, indicate whether you want to auto-discover everything, and input a path to the “EMC Symmetrix Main Input Folder”. This is the folder where you actually archive all of the performance and configuration data from your storage system. The documentation tells you that this should be:
* If the main input folder is on a remote Windows machine, you must share the folder before you add the adapter instance. Do not map the main input folder. Windows services do not work with mapped drives.
* If the main input folder is on a remote Linux machine, you must mount the folder to the Collector server before you add the adapter instance.
The problem is that if you actually have your Solutions Enabler host running on Windows, you need to input a UNC path in the format of \\servername\sharename. But the virtual machines inside the vApp do not come with any access methods for Windows shares: you won’t find tools like mount.cifs or smbclient, and you don’t even have the option to specify smbfs as the type of file system to mount. And that means what? Well, you have two options to overcome this situation. You can either install Services for Unix/Services for NFS on your Windows host and set up an NFS share on your Windows machine, or you can migrate your Solutions Enabler host to a Linux machine and set up everything there.

OK, so how do I configure this stuff under Linux? Glad you asked. You can follow some of the steps from the post that Matt created, but I’m going to write them down here anyway so you will have one page with all the steps you need. I’m going to assume that you have already set up your Linux machine, and that you have installed the Solutions Enabler package. Go into the following file: /usr/emc/API/symapi/config and add the following lines at the end of the file, then make sure you save your changes (create a backup of the original, that is always a good idea):
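After saving the file, restart the storage statistics daemon so it picks up the new options. A minimal sketch, assuming the default SYMCLI install path:

/opt/emc/SYMCLI/bin/stordaemon shutdown storstpd
/opt/emc/SYMCLI/bin/stordaemon start storstpd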
Check if the daemon is up and running again by issuing the following command. The first line should show the Daemon State as “Running”:
/opt/emc/SYMCLI/bin/stordaemon show storstpd
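For reference, the top of the output should look something like this (the exact layout may differ between Solutions Enabler versions):

Daemon State             : Running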
Now, since the Analytics VM will actually be collecting the information from the adapter, it needs to be able to access the files from your Solutions Enabler host. The Analytics VM runs the collection process as a user called “admin”, so we need to consider something: the admin user on the vCenter Operations appliance runs with a user ID (UID) of 1000 and a group ID (GID) of 1003. That means that we should either install our Solutions Enabler using a user with the same user ID and group ID, or map some things so that the admin user can actually access the files later on. In order to export the directory with the required files for the Symmetrix adapter, we add the following line to /etc/exports:

/usr/emc/API/symapi/stp *(rw,insecure,all_squash,anonuid=0,anongid=0)

Obviously, this isn’t the best you can do from a security perspective, so feel free to change these options as needed for your environment, but basically what we are doing here is this:
The * just means that all IP addresses have access. You can change this to, for example, the IP of the Analytics VM.
rw means that the export is created with read and write access.
insecure means that clients can use non-reserved ports.
all_squash means that all users get mapped to the anonymous user account.
anonuid=0 means that the anonymous user ID gets mapped to user ID 0. Be careful, since this is the root account!
anongid=0 means that the anonymous group ID gets mapped to group ID 0. Again, this is the root group!
If you did install your Solutions Enabler as a different user, make sure that you map anonuid and anongid to the respective numerical IDs, to allow access to the files we are going to export. Now, we simply restart the NFS server, or have it re-read its config should it already be online, using:
/etc/init.d/nfsserver restart
or
exportfs -ra
We can check if the export is working, using the following command:
showmount -e localhost
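If everything is in place, you should get back output along these lines (the client column reflects whatever you put in /etc/exports):

Export list for localhost:
/usr/emc/API/symapi/stp *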
Now, we create a scheduled job to archive the Solutions Enabler data. To do that, add the following line to your crontab:

2-57/5 * * * * /opt/emc/SYMCLI/bin/stordaemon action storstpd -cmd archive

This will cause the job to start at 2 minutes past the hour, and run in 5-minute intervals. Check under /usr/emc/API/symapi/stp/ttp to see if you have a new directory. Normally the directory should be named after the serial number of your storage array, and contain compressed files with the information the Symmetrix adapter will need.

The final thing to do right now is log on to the Analytics VM and create a folder where we will mount the required files, for example a directory called /media/VMAX. Once you have created the directory, edit /etc/fstab to contain the following line:

10.10.10.10:/usr/emc/API/symapi/stp /media/VMAX nfs rw,lock 0 0

Make sure you change the IP address to match that of your Solutions Enabler host, and then mount the directory using the following command:
mount /media/VMAX
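To quickly verify that the mount is live, something along these lines should show the NFS mount and the ttp directory with the serial-number folder mentioned earlier:

mount | grep VMAX
ls /media/VMAX/ttp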
If you don’t have a firewall blocking communication, you should now be able to traverse the subdirectories and access the files. Finally, you can now configure the adapter, and input the directory you just mounted as the “EMC Symmetrix Main Input Folder”. So, in the text field, simply enter the following as the path:
/media/VMAX/ttp
If you test the adapter now, you should see it come back successfully, and after giving it a bit of time, start working with the data you are now importing from your VMAX/Symmetrix system. 🙂
That’s a mouthful of abbreviations for a title, isn’t it?
So, let me give you some background info. VMware introduced something called the vSphere Metro Storage Cluster (vMSC), and Duncan Epping talks about this feature here.
What the vMSC allows us to do is create a regular stretched vSphere cluster, but now also stretch the storage between the two sites. This can be done in two ways (to quote from Duncan’s article):
I want to briefly explain the concept of a metro / stretched cluster, which can be carved up into two different types of solutions. The first solution is where a synchronous copy of your datastore is available on the other site; this mirror copy will be read-only. In other words, there is a read-write copy in Datacenter-A and a read-only copy in Datacenter-B. This means that your VMs in Datacenter-B located on this datastore will do I/O on Datacenter-A, since the read-write copy of the datastore is in Datacenter-A. The second solution is what EMC calls “write anywhere”. In this case VMs always write locally. The key point here is that each of the LUNs / datastores has a “preferred site” defined, which is also sometimes referred to as “site bias”. In other words, if anything happens to the link in between, then the storage system on the preferred site for a given datastore will be the only one left with read-write access to it.
The last scenario described here can obviously cause some issues. EMC tried to address this by introducing an “independent 3rd party”, in the form of the VPLEX Witness. Some documentation states that this witness should run in a 3rd site, but I would recommend running it in a separate failure domain.
In essence, we have created the following setup:
Awesome stuff, because we can do new things that weren’t quite possible before. Since VPLEX is one of the key storage virtualization solutions from EMC that allow active/active disk access, we can perform a vMotion between the two sites, and due to the nature of VPLEX, a sort of storage vMotion happens on the underlying disks at the same time. That, without you having to shut down the VM to do both things at once. Pretty neat!
Now, as Chad describes here, a new disk connectivity state was introduced with vSphere 5, called “Permanent Device Loss” or PDL. This was a great feature to communicate to your infrastructure that a target was intentionally removed. You could unmount the disk, and remove the paths to your target in a proper way.
It was also useful to indicate an unexpected loss of your target, indicating that your cluster is in a partitioned state. The problem was that the PDL state and VMware HA didn’t work so well together: when a device dropped into this state, HA didn’t “kill” your VM, and your virtual machine would usually continue to respond to pings, but that was about it.
Then along came vSphere 5 Update 1, which allows us to set a flag on each of the hosts inside our cluster, and a different flag for our HA cluster. Now we can actually have HA terminate the VMs and restart them on the hosts in our cluster that still have access to their datastores in their respective preferred sites.
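For reference, these are the two settings as documented for vSphere 5.0 Update 1; take this as a sketch and double-check the documentation for your exact build. On each host, the following line goes into /etc/vmware/settings:

disk.terminateVMOnPDLDefault = "True"

And on the HA cluster, you set the advanced option:

das.maskCleanShutdownEnabled = true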
I’ve created a short (OK, 8 minutes) video that shows exactly this scenario. You’ll get a quick view of the VPLEX setup. You’ll see the Brocade switches being switched from a config with the normal full zoneset to a zoneset that disables the inter-switch links between both VPLEX clusters. And you’ll see the settings inside of my vSphere lab setup, along with the behavior of the hosts and virtual machines.
Since I’m quite new to creating videos like this, I hope the output is acceptable, and the video is clear enough. If you have any questions, feedback or would like to see more, please leave me a comment and I’ll see what I can do. 🙂
Just a quick modification to my post, since it wasn’t actually VM-HA (or VM monitoring) responding to the PDL event, but HA terminating the VM when running into the PDL state, as Duncan pointed out to me on Twitter. Sorry for any confusion I may have caused!
I have to admit it. I stole, or rather “borrowed”, part of this title from a blog post by a colleague of mine, Erik Zandboer. He just published a post on the mindset behind VAAI, and what the actual effect is on the array itself and on your vSphere infrastructure.
VAAI was already available in vSphere 4.1, and with the switch to vSphere 5 some new features were introduced, which means that as of this release, we now have the following situation:
Block:
HW accelerated Zeroing
HW accelerated Copy
HW accelerated Locking

File:
NFS – Full Copy
NFS – Extended Statistics
NFS – Space Reservation
Some folks will say that I left out Thin Provision Stun, which is true. And while it does help to resolve some issues, I left it out because I don’t really view it as a hardware offload, which is what I’m trying to focus on.
I took the hardware in our lab (an EMC VNX 5300) for a spin in our vSphere 5 setup to show the same thing as Erik showed in his blog, but instead showing off some of the File/NFS accelerations.
To get the VNX to actually support NAS VAAI offloading and get the results you expect, you need to meet the following prerequisites:
vSphere 5 – You need vSphere 5 installed with an Enterprise or Enterprise Plus license
VNX OE for File 7.0.35.x – Your VNX Operating Environment for File needs to be at least version 7.0.35.x or newer
NFSv3 – The offloads only work on NFSv3-based datastores
The vSphere NFS VAAI offload plugin which is referenced here
If all those prerequisites are met, you should be able to go into your vSphere Client and see Hardware Acceleration listed as Supported:
You could also enable SSH for your ESXi host (go to the individual host, click the “Configuration” tab, select “Security Profile” and start the SSH service), and check the support from the command line. For block devices you could enter the following command:
esxcli storage core device vaai status get

and get back a result that shows you the NAA ID, the VAAI plugin name, and the primitives with their support state. By using the following command:

esxcli storage core device list

you get a similar output, but again, this only works for block devices and won’t really help you when checking the support for NFS. I haven’t found any way so far to get a reliable statement back via SSH, but I’ll continue looking and update this post if I find something.
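One thing that might be worth experimenting with, assuming your ESXi 5 build includes it, is the NFS datastore listing, which can print a Hardware Acceleration column for each mounted NFS datastore:

esxcli storage nfs list

Treat that as a pointer rather than a guarantee, though; I can’t vouch for it on every build.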
In case of the VNX, we can actually check on the array itself to see if we are using the primitives, so I’m actually showing you the output from the array itself, using the following command on the VNX:
server_stats server_2 -monitor nfs.v3.vstorage -type accu -i 1
First off, I went back into my ESXi host and into the NFSv3 datastore that was hosting my virtual machine, in this case a Windows 2008 server running an SAP Enterprise Portal, and I used vmkfstools to create a clone:
vmkfstools -i GI-C-SAP-EPBW.vmdk CLONE-GI-C-SAP-EPBW.vmdk

and I set off a snap using a similar command. All the while, I had the VNX command that I posted before running in a different window. The output from the VNX showed that we were actually using the VAAI NFS offloading functions:
server_2    NFS VAAI op         VAAI Op Calls   VAAI Op Total uSecs   VAAI Op Max uSecs   VAAI Op Average uSec/Op
Timestamp
09:07:14
09:07:15
09:07:16
09:07:17
09:07:18    vaaiFastClone       1               0                     0                   0
            vaaiVxAttrs         3               0                     1                   0
            vaaiRegister        5               0                     0                   0
09:07:19
.......
09:08:27    vaaiOffloadStatus   1               0                     0                   0
            vaaiVxAttrs         7               1                     1                   0
            vaaiRegister        10              0                     0                   0
09:08:28
09:08:29
09:08:30
09:08:31
09:08:32    vaaiOffloadStatus   2               0                     0                   0

server_2    NFS VAAI op         VAAI Op Calls   VAAI Op Total uSecs   VAAI Op Max uSecs   VAAI Op Average uSec/Op
Summary
Minimum     vaaiFullClone       0               0                     83308               -
            vaaiFastClone       0               0                     0                   0
            vaaiOffloadStatus   0               0                     0                   0
            vaaiOffloadAbort    0               0                     0                   -
            vaaiVxAttrs         0               0                     1                   0
            vaaiReserveSpace    0               0                     0                   -
            vaaiRegister        0               0                     0                   0
Average     vaaiFullClone       0               0                     83308               -
            vaaiFastClone       1               0                     0                   0
            vaaiOffloadStatus   0               0                     0                   0
            vaaiOffloadAbort    0               0                     0                   -
            vaaiVxAttrs         3               0                     1                   0
            vaaiReserveSpace    0               0                     0                   -
            vaaiRegister        5               0                     0                   0
Maximum     vaaiFullClone       0               0                     83308               -
            vaaiFastClone       1               0                     0                   0
            vaaiOffloadStatus   2               0                     0                   0
            vaaiOffloadAbort    0               0                     0                   -
            vaaiVxAttrs         7               1                     1                   0
            vaaiReserveSpace    0               0                     0                   -
            vaaiRegister        10              0                     0                   0
Once the files are created, use:

vmkfstools --extendedstat GI-C-SAP-EPBW.vmdk

on the source file, or on the snap or clone, to display the extended statistics. The “Capacity bytes” value shows the allocated space for the virtual disk. The “Used bytes” value displays the blocks used for the virtual disk, which in the case of our snapshot means the fast clone and its parent. The “Unshared bytes” value displays the usage of the actual fast clone itself, without the parent.
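To give you an idea of what comes back, the output looks roughly like this (the numbers below are made up for illustration):

Capacity bytes: 42949672960
Used bytes: 18253611008
Unshared bytes: 2684354560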
I should point out that the offload did speed up my full clone operation, but “only” in the range of 20%. That isn’t a huge gain, but using both esxtop and the vSphere Client performance graphs showed that the ESXi server was busy doing what it is supposed to do: virtualizing my resources! And that’s the most important thing, isn’t it?
A lot of folks out there use the VMware vCenter SRM to create and manage disaster recovery scenarios for their virtualized environments.
Besides giving you a button to click to fail over (parts of) your environment to a different site, it has one big benefit: it forces you to think about your systems. You need to consider which systems are vital to your infrastructure, and you need to be aware of the dependencies that you may have in your environment. There are numerous other things that SRM can help with, but that’s not what I wanted to highlight here.
A couple of days ago, I was at the VMware office in Munich, helping to set up an SRM 5.0 demo that will serve as a hands-on lab for people interested in SRM. The base of this SRM installation is a virtualized Isilon cluster, which offers the ability to easily provision storage, and offers replication between sites (a quick video overview by my colleague Nick Weaver can be found here).
While setting up the Isilon SRA, which you can download from the VMware website, I ran into a problem. When you download and extract the actual SRA, you’ll get a bunch of PDF files and two executables. One is the installer for the actual storage replication adapter. It’s called “EMCIsilonSRASetup_1_0.exe”, and you need a current Java development kit to get it running, but it should install correctly.
The second file is called “IsilonReplicationHelperSetup.exe”, and this is used to configure the SRA before using it in SRM. Now, when starting this helper, both Jase McCarty and I have seen errors that refer to a missing Java class (com.izforge.izpack.installer.Installer) for a program called IzPack, which was used to create the installer. After extracting the actual executable, it seemed like some classes/libraries were missing from it.
I’ve been in touch with Isilon support after running into the error, and after checking with them, they gave me an MD5 hash of a working copy of IsilonReplicationHelperSetup.exe, which is:

416535bc1c7d7f133037af04b5502e3b

However, the MD5 of the executable that I got was:

4342E880A99EE2ED6DA1205F1018233D

which obviously is different. The MD5 of the downloaded file and the MD5 that VMware shows for the actual zip that contains the SRA did match up, though.
So, I’m putting this post out there as a word of warning. It seems like one of the Isilon SRA files on the VMware website is non-functional. Should anybody out there see this, make sure to contact Isilon support and reference case 00169080, which is my case number.
I’m still working with Isilon support to see what the next steps are going to be, and I’m sure this is going to be resolved soon enough, but I wanted to put this information out there for you in the meantime, to avoid people having to go through the same process as I did. It might save some folks a bit of time. And I’ll make sure I update this post when I get a solution from the Isilon support team.
Update – January 16th 2012:
While I’m still working with the Isilon support group to get everything sorted out, I did get a version of the IsilonReplicationHelperSetup Java archive that seems to be working. Now, I’m sharing this with you all while we try to get things resolved, and to get the working download on the VMware site, but I need to add a large disclaimer:
This file is not officially supported by EMC and/or Isilon, and while this file worked for me, your mileage may vary, and I would recommend that you do not use this file in a production environment! The file might work in a test environment, but please refrain from using it in production. Use the official files from the VMware download site, or create a case with Isilon and/or VMware support!
Now, to help you verify this file, the MD5 for the Java archive is:

FFAC907E70FD0BFC73076793B9D5FCB4

and you can get the file here.
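If you want to double-check what you downloaded, you can generate the MD5 yourself; a quick sketch, with a hypothetical file name that you should replace with the actual one:

md5sum IsilonReplicationHelperSetup.jar
(or on Windows: certutil -hashfile IsilonReplicationHelperSetup.jar MD5)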
Update – February 10th 2012:
VMware has updated the Isilon SRA file, and the new MD5 for Version 1.0 (released 01/18/2012) currently is:

d8b8408ab259d64ee3f5a83486e2a25e

This actually contains the working files, so you should be all set. 🙂
So folks, here’s a shameless copy of a blog post from one of the guys on my team. Dave was just brilliant and actually created a virtual storage appliance of the EMC VMAX. I think that’s downright awesome, and I wanted to help him get attention for what he did, so I asked him if I could copy his blog post, which is what you will find here:
As the title suggests there is indeed a Symmetrix VMAX VSA. I have been working on this project since shortly after EMC World. As I look back through my emails, I received the code on 6/3/11 and I have been working on it in almost all of my free time since then.
Now finally it will make its public debut this week at VMworld 2011 as part of the EMC Interactive Demo booth on the show floor. As part of its grand unveiling I thought I would tell you a little about what makes it work.
Now, to make a few things clear up front: this is a science project, I cannot distribute it, and it does “work”. As part of the lab (I will publish the guide), the student actually provisions an iSCSI disk from the VSA to an ESXi 5.0 host.
One of the first things I noticed about the code when trying to virtualize it: it’s HUGE. There are 2 parts to the VSA.
1. The Service Processor (SP). In a physical VMAX this is the 1U server that is racked in the system bay. It runs a special image of Windows XP and contains all of the proprietary software used to manage a VMAX. If you own a VMAX, this is what you will see EMC field service personnel using when they come to work on your system. This is NOT accessible by an end user, as it requires special RSA credentials that change weekly (one reason we can’t distribute it). Its specs are 2 vCPUs, 2GB of RAM and about 10GB of disk space.
2. Enginuity. This is the Operating Environment of the Symmetrix. For the purposes of this VSA it runs in a SUSE Linux Enterprise 11 VM. One of the big deals with the VMAX was that Enginuity was ported from a PowerPC CPU to an Intel x86-based architecture. Without this change this VSA would never exist. Now, this VM is big. So big, as a matter of fact, that I had to use an RC build of vSphere 5 in order to even get it to work. I was finally able to scale it down a bit, but at one point it was using 32 vCPUs, 92GB of RAM and about 250GB of disk space.
Obviously, one of the challenges for using this in a lab was that I needed it to use fewer resources. In the beginning this VMAX was a Single Engine model, which means it had 16 “slices” running: each director has 4 DA (back-end) directors and 4 FA (front-end) directors. I quickly found this was the biggest reason I needed so much memory and CPU. After working with one developer, Chakib (who totally rocks, by the way), we were able to scale this down to 1 FA and 1 DA per director. One interesting side note: when I was going down this path, I asked Chakib what kind of VM he was using to test this. His reply was, “I am not using this in a VM, I have a physical Linux box with 200GB of RAM”. So I clearly had some work to do. But in its current state it uses 8 vCPUs and “ONLY” 48GB of RAM. Which is still pretty darn big, but a lot better than it was when we started.
The networking requirements are pretty simple: the SP needs 1 public NIC so that we can use its management tools, plus 2 internal NICs which are used for internal communication to the directors, in our case the Linux VM. The Linux VM needs the 2 internal NICs and 1 NIC to present an iSCSI target on. Then we put our ESXi host’s VMkernel NIC on the same vSwitch so it can use the iSCSI target provided by the VSA.
So that’s all great you say, but what actually works? That’s a good question.
What works is using Standard Devices, and very small ones, today. One of the things I was told when I was given the code was that this WON’T and CAN’T do any I/O. Which obviously proved to be a bit of an issue. Chakib really worked his butt off to get me something that does I/O. So this is not like the Celerra UBER VSA by @lynxbat, where you can run a VM off of it. We hope we can do that one day. Thin Pools work to the extent that you can create them and put devices in a pool, but when you present one to a host it will not work. This kept me from using the VSI SPM plugin for vSphere as part of my lab; hey, we always have next year! The really neat part to me is that the internal tools (SymmWin) that run on the SP fully work. It’s like having an actual VMAX, but without all the fuss of getting a few 50A power drops. As an ex-customer this to me is the coolest part: I got to put on my own BIN files and use Inlines (an internal tool used to directly talk to the hardware). As a total nerd, this thing is a dream come true.
So what’s next?
Well, a lot of that depends on YOU! Since this is a total science project, we need to show those in Symmetrix Engineering that this is worth putting their time and money into. I need everyone here at VMworld this week to come try this thing, give me feedback, and leave comments here, and if you aren’t at the show, express your desire for us to continue working on it. If no one is interested, this will ultimately die on the vine. Please fill out this form so we can show how many of you would like to see this project continue.
I have to give special thanks to Chad Sakac (@sakacc) and Chris Horn (@horn_Chris) for getting me involved in this project and letting me run with it, and for all of the support they gave me during this process.
Here is a link to the lab guide being used this week at VMworld. Take a look and let me know what you think!
Big thanks to Matt Cowger (@mcowger), Scott Lowe (@scott_lowe), and Tee Glasgow (@teeglasgow) for their help with the lab guide. Also to Rick Scherer (@rick_vmwaretips) for the blog help.
I’m sitting on board a delayed Airbus A320 from Las Vegas to Chicago while I’m writing up this small post.
I left for Las Vegas just over a week ago to help set up gear for EMC World, and my week has been amazing. After landing in Vegas, I met up with some of the guys, went to dinner and to bed, just to be wide awake at two in the morning due to jet lag. Ain’t that just the way it goes?
Anyway, after attending some conference calls, the rest of the team was up and that meant it was time to get started setting things up. Part of it was wrapping up the guides that were created for the labs, and converting them to a format that was suitable for the lab guide reader that was created by fellow vSpecialist Nick Weaver.
Part of it was also getting the backup system up and running on site, because even though we ran the entire vLabs off of the infrastructure located on the other side of the USA, we needed a backup system that would be able to support the labs in case of issues. One of the first challenges was actually getting the backup hardware to the convention center, and getting the truck to the right spot to offload the hardware wasn’t as easy as it sounds. Not to mention the fact that it’s hard to work when people relocate you from room to room on what seemed to be an hourly basis.
But, all worked out in the end. We were able to get everything loaded and working, and actually started setting up shop in the vLabs area in the convention center. 200 WYSE consoles, custom-written lab management software, almost 20 different labs to pick from, and a team of well over 25 folks helping out assured that the vLabs were a good experience.
Did it all run smoothly? No, not all of the time. If you eat your own dog food, want to be bleeding edge, and run your service in the cloud, you are bound to run into some glitches and hiccups. Think of things like firewalls that get in your way, switches not cooperating, or even something like a simulator that tends to crash more than it works (I’m looking at you, RecoverPoint 3.4 Sim…). Then there’s also the fact that this was a first for us in such an environment and at such a scale.
And even with those things, we managed to do an incredible job. We had a great team on site that pulled all-nighters to get our environment up and running. Some of us were living off of about 3 hours of sleep per night, but we still managed to provision over 3,000 VMs (exact numbers to be published soon).
And besides the hard work we also had a lot of fun. We got to play craps with two Elvi, perform a pit stop for Wayne and Garth (errr, Chad and Wade), and the random Hangover quotes on our lab headsets were always good for a chuckle.
And now, I’m flying back home and am feeling somewhat melancholic. It’s been a hell of a ride, and coming down from the chaos, or not getting to hang out with my colleagues until I see them again the next time is sort of a strange feeling. It’s like saying goodbye to a dear friend that is leaving for a while, and although you have some very good memories, it leaves you with a sort of funny feeling.
So, for now I want to thank all of the guys that made this week an incredible experience. Folks like Aaron, Erin, Nick, Dave, Rick, Travers, Fred, Tommy, JT and Heather. And of course all of the others that I won’t all name here right now. Thanks guys!
So, since there will be two big events going on in Las Vegas (EMC World and Interop), it looks like we will have a lot of folks in one spot that have a passion about storage, virtualization and all things IT.
This led me to the idea of setting up a get-together to have some drinks and food, and to just talk and geek out. Most of the folks on Twitter have heard about storagebeers and vBeers, and now we are trying to set one up in Las Vegas. Currently this is based on a PYOB (Pay Your Own Bill) model, but I will try to see if we can get some vendors to perhaps give out a round.
Since not everyone has the same agenda, I’ve created this list to try and find out when we can get the most people together. Feel free to enter the date where you think you would be available, and I’ll see to it that a mail gets sent out with the final date. Also, if you know a good spot to have these drinks, feel free to add them to the location field, or just e-mail me at bas.raayman (AT) emc.com. I look forward to seeing you all (again)!
Oh, and just as a short disclaimer: I won’t sell your e-mail addresses, abuse them for spam or anything like that. It’s just to set up this event and send you an update on it, and that’s it. After the event, all mail addresses will be deleted.
Updates:
Update: 1 round is on Storage Staffing, thanks!
Update 2: The date is set to May 10th, 20:00 (Pacific) at Harrah’s Piano in Las Vegas. A separate update via e-mail will be sent out to the folks who joined the online meeting. Feel free to join us if you haven’t registered here! 🙂
I spent some days in Cork, Ireland this week, presenting to a customer at the EMC Executive Briefing Center. Besides the fact that I’m now almost two months into my new job, and I’m loving every part of it, there is one part that is extremely cool about my job.
I get to talk to customers about very cool and new technology that can help them get their jobs done! And while it’s in the heart of every technology-loving geek to get caught up in bits and bytes, I’ve noticed one thing very quickly: the technology is usually not the part that is keeping the customer from doing new things.
Everybody knows about that last part. Sometimes you will actually run into a problem where some new piece of kit is wreaking havoc and we can’t seem to put our finger on what the problem is. But most of the time, we get caught up in entirely different problems altogether. Things like processes, certifications (think of ISO, SOX, ITIL), compliance, security, or something as “simple” as people who don’t want to learn something new, or who feel threatened because their role might be changing.
And this is where technology comes in again. I had the opportunity to talk about several things with this customer, but one of my key points was that technology should help make my life easier. One of the cool new things that will actually help me in that area was a topic that was part of my presentation.
Some of the VMware admins already know about this technology, and I would say that most of the folks that read blogs have already heard about it in some form. But when talking to people at conventions or in customer briefings, I get to introduce folks over and over to a new technology called VAAI (vStorage API for Array Integration), and I want to explain again in this blog post what it is, and how it might be able to help you.
So where does it come from?
Well, you might think that it is something new, and you would be wrong. VAAI was introduced as part of the vStorage API during VMworld 2008, even though the release of the VAAI functionality to customers came with the vSphere 4.1 update (4.1 Enterprise and Enterprise Plus). But VAAI isn’t the entire vStorage API, since that consists of a family of APIs:
vStorage API for Site Recovery Manager
vStorage API for Data Protection
vStorage API for Multipathing
vStorage API for Array Integration
Now, the “only API” that was added with the update from vSphere 4.0 to vSphere 4.1 was the last one, VAAI. I haven’t seen any roadmaps yet that contain more info about future vStorage APIs, but personally I would expect to see even more functionality coming in the future.
And how does VAAI make my life easier?
If you read back a couple of lines, you will notice that I said that technology should make my life easier. Well, with VAAI this is actually the case. Basically, what VAAI allows you to do is offload operations on data to something that was made to do just that: the array. And it does that right in the ESX storage stack.
As an admin, you don’t want your ESX(i) machines to be busy copying blocks or creating clones. You don’t want your network clogged up with storage vMotion traffic. You want your host to be busy with compute operations and with the management of your memory, and that’s about it. You want as much reserve as you can get on your machine, because that allows you to leverage virtualization more effectively!
So, this is where VAAI comes in. Using the API that was created by VMware, you can now use a set of SCSI commands:
ATS: This command helps you out with hardware-assisted locking, meaning that you don’t have to lock an entire LUN anymore, but can now just lock the blocks that are allocated to the VMDK. This can be of benefit, for example, when you have multiple machines on the same datastore and would like to create a clone.
XCOPY: This one is also called “full copy” and is used to copy data and/or create clones, avoiding all the data being sent back and forth to your host. After all, why would your host need the data if everything is stored on the array already?
WRITE-SAME: This one is also known as “bulk zero” and will come in handy when you create a VM. The array takes care of writing zeroes on your thin and thick VMDKs, and helps out at creation time for eager zeroed thick (EZT) guests.
Sounds great, but how do I notice this in reality?
Well, I’ve seen several scenarios where, for example during a storage vMotion, you would see a reduction in CPU utilization of 20% or even more. In the other scenarios, you should normally also see a reduction in the time it takes to complete an operation, and in the resources that are allocated to perform such an operation (usually CPU).
Does that mean that VAAI always reduces my CPU usage? Well, in a sense: yes. You won’t always notice a CPU reduction, but one of the key criteria is that with VAAI enabled, all of the SCSI operations mentioned above should always perform faster than without VAAI enabled. That means that even when you don’t see a direct reduction in CPU usage, you will see that since the operations are faster, you get your CPU power back more quickly.
Ok, so what do I need, how do I enable it, and what are the caveats?
Let’s start off with the caveats, because some of these are easy to overlook. The hardware offloads will not be used in the following cases:
The source and destination VMFS volumes have different block sizes
The source file type is RDM and the destination file type is non-RDM (regular file)
The source VMDK type is eagerzeroedthick and the destination VMDK type is thin
The source or destination VMDK is any sort of sparse or hosted format
The logical address and/or transfer length in the requested operation are not aligned to the minimum alignment required by the storage device (all datastores created with the vSphere Client are aligned automatically)
The VMFS has multiple LUNs/extents and they are all on different arrays
Or short and simple: “Make sure your source and target are the same”.
The key criteria for using VAAI are vSphere 4.1 and an array that supports VAAI. If you have those two prerequisites covered, you should be set to go. And if you want to be certain you are leveraging VAAI, check these things:
In the vSphere Client inventory panel, select the host
Click the Configuration tab, and click Advanced Settings under Software
Check that these options are set to 1 (enabled):
DataMover/HardwareAcceleratedMove
DataMover/HardwareAcceleratedInit
VMFS3/HardwareAcceleratedLocking
Note that these are enabled by default. And if you need more info, please make sure that you check out the following VMware knowledge base article: 1021976.
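If you’d rather check this via SSH or the console, the same values can be read with esxcfg-advcfg; a quick sketch, assuming a standard install:

esxcfg-advcfg -g /DataMover/HardwareAcceleratedMove
esxcfg-advcfg -g /DataMover/HardwareAcceleratedInit
esxcfg-advcfg -g /VMFS3/HardwareAcceleratedLocking

Each command should report a value of 1 when the corresponding offload is enabled.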
Also, one last word on this. I really feel that this is a technology that will make your life as a VMware admin easier, so talk to your storage admins (if that person isn’t you in the first place) or your storage vendor and ask if their arrays support VAAI. If not, ask them when they will support it. Not because it’s cool technology, but because it’s cool technology that makes your job easier.
And, if you have any questions or comments, please hit me up in the remarks. I would love to see your opinions on this.
Update: 2010-11-30
VMware guru and Yellow-Bricks mastermind Duncan Epping was kind enough to point me to a post of his from earlier this week that goes into more detail on some of the upcoming features. Make sure you check it out right here.
A little while back, EMC released a new version of its CLARiiON “FLARE” (Fibre Logic Array Runtime Environment) operating environment. This release brings us to version 04.30 and again packs some enhancements that might interest you, so once more here’s a short overview of what this update brings:
Let’s start off with some basics. Along with this update you will find updated firmware versions, as well as the following enhancements:
With version 04.30.000.5.507 you get support for FCoE. Prerequisite is using a 10 Gigabit Ethernet I/O module on CX4-120, CX4-240, CX4-480, and CX4-960 arrays.
SATA EFD support.
Following that point, you can now use Fibre Channel EFD and SATA EFD in the same DAE.
And, you can now also mix Fibre Channel and SATA EFDs in the same RAID group.
VMware vStorage API support, in the form of “vStorage full copy acceleration” (basically, the array takes care of copying all the blocks, instead of sending everything to and from the host) and in the form of “Compare and Swap” (an enhancement to the LUN locking mechanism).
Rebuild avoidance. This feature changes the routing of I/O to the service processor that still has access to all the drives in the RAID group. You do need write caching to be enabled if you want to be able to use this feature.
Virtual provisioning, basically EMC’s name for thin provisioning on the array.
There are some nice features in there, but for me personally the virtual provisioning, the FCoE support and the vStorage API support are the main ones.
One thing that caught my eye was in the section called limitations for FLARE version 04.30.000.5.507. In the release notes you will find the following statement:
Host attach support – Supported host attached systems are limited to the following operating systems: Windows, VMWare, and Linux
Which would mean that you have a problem when you are using something else, like Solaris or HP-UX. I’m trying to get some confirmation, and I’ll update this post as soon as I have more info.
Update
The statement has changed in the meantime:
Host attach support – Supported hosts that can be attached over an FCoE connection are limited to the following operating systems: Windows, VMWare, and Linux
Which means that this is just related to FCoE connected hosts.
After some feedback on Twitter from, among others, Andrew Sharrock, I thought it might be wise to say a few sentences about the Virtual Provisioning feature.
In short, Virtual Provisioning was already introduced with FLARE 28. The problem was that, at the time, you could only use the feature with thin pools. Basically, with this update, you also get support for a newer version of the feature. Things that were added are:
Thick LUNs
LUN expand and shrink
Tiering preference (storage allocation from pools with mixed drives and different performance characteristics)