EMC, VMUG, VMware

VMUG for Germany west (Schwalbach am Taunus)

Just a small reminder for the people that live in my area. On Friday, June 7th, the German VMware User Group (VMUG) west will be meeting up at the EMC office in Schwalbach (click here for a PDF with the address and route). In case you don’t know what the VMUG is for, here’s a quick summary:

The VMware User Group (VMUG) is an independent, global, customer-led organization, created to maximize members’ use of VMware and partner solutions through knowledge sharing, training, collaboration, and events.

The beauty of it? It’s something set up by users for other users. That means that people come to these events to get information that is vendor neutral, and have the ability to talk freely to others without having to fear that someone is trying to only give them the marketing pitch. Or at least, that is what it should be like.

So, the Germany West VMUG Meeting is on Friday, June 7, 2013 at the following time and address:

09:30 – 16:15

EMC Deutschland GmbH
Am Kronberger Hang 2a
65824 Schwalbach/Taunus

You can use this link to register for the event, free of charge, and get to see talks on VMware Nicira, “VMware Network & Security” and other security related topics.

And one important thing to note: the VMUG is a community set up by VMware customers for VMware customers, to exchange ideas, share common issues or worries, learn, and get to know others in the community. If you feel like you can contribute, submit a proposal for a talk, or suggest a topic for the next VMUG. The more people that participate, the better a VMUG gets!

I’ll be there, and I’m looking forward to seeing you there!

General

Time for a change: Keep calm and…

Keep calm and join Nutanix.

Keep calm and join Nutanix – Picture by Christian Mohn
Yep, no sense in beating around the bush. I resigned from EMC, and after wrapping up open topics, I will be starting as the first German systems engineer for Nutanix on June 17th.

I’ve learned an incredible amount at EMC. After joining EMC in 2010, I was lucky to be part of a team that has done some incredible things. I feel like the vSpecialist team set a bar on how customer interaction can work, how a team of great individuals can combine into something much more, and how it can transform the way a company goes about things. I learned ways to present information (hopefully in an interesting way), made friends, was able to help customers, worked on several certifications, and always had the feeling that I was still the dumbest guy on the team. I loved the fact that I was able to still ask tough questions internally, without being viewed as “that guy that just sits around nagging”. I’ve got so much to be grateful for, and I am. People like Chad Sakac or Wade O’Harrow who saw some potential in me, or someone like Holger Daube who has been a better boss to me than I could wish for. There are too many to name and thank individually, but thank you to all of you!

But I am moving on. After talking to several people, and discussing and reading things like this, I can’t help but feel that this is a great chance. I can try to set up something new, help define solutions, and get to see what it looks like working in a smaller company, with what I’m expecting to be an even crazier pace.

So, here’s to seeing you on the flip-side, and having fun with something new! 🙂

Clustering, EMC, Storage, Virtualization, VMware, VPLEX, vSphere

VMware HA demo using vMSC with EMC VPLEX Metro

That’s a mouthful of abbreviations for a title, isn’t it?

So, let me give you some background info. VMware introduced something called the vSphere Metro Storage Cluster, and Duncan Epping talks about this feature here.

What vMSC allows us to do is create a regular stretched vSphere cluster, but now also stretch the storage between the two sites. This can be done in two ways (to quote from Duncan’s article):

I want to briefly explain the concept of a metro / stretched cluster, which can be carved up in to two different type of solutions. The first solution is where a synchronous copy of your datastore is available on the other site, this mirror copy will be read-only. In other words there is a read-write copy in Datacenter-A and a read-only copy in Datacenter-B. This means that your VMs in Datacenter-B located on this datastore will do I/O on Datacenter-A since the read-write copy of the datastore is in Datacenter-A. The second solution is which EMC calls “write anywhere”. In this case VMs always write locally. The key point here is that each of the LUNs / datastores has a “preferred site” defined, this is also sometimes referred to as “site bias”. In other words, if anything happens to the link in between then the storage system on the preferred site for a given datastore will be the only one left who can read-write access it.

The last scenario described here is something that obviously can cause some issues. EMC tried to address this by introducing the “independent 3rd party”, in the form of the VPLEX Witness. Some documentation states that this witness should run in a 3rd site, but I would recommend running it in a separate failure domain.
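To make the “preferred site” idea a bit more concrete, here is a toy model in Python. It is only a sketch of the decision logic, not how VPLEX is implemented: the function name, the site labels and the exact witness behaviour are assumptions made for illustration. It assumes that on a link failure only the preferred site keeps read-write access, and that without a witness the surviving non-preferred site cannot tell a site failure from a link failure, so it suspends I/O.

```python
from enum import Enum
from typing import Set


class Failure(Enum):
    NONE = "no failure"
    LINK = "inter-site link down"
    SITE_A = "site A down"
    SITE_B = "site B down"


def sites_with_rw_access(preferred: str, failure: Failure, witness: bool) -> Set[str]:
    """Return the sites that can still read and write a given datastore.

    Hypothetical, simplified model: sites are just the labels "A" and "B".
    """
    if failure is Failure.NONE:
        return {"A", "B"}
    if failure is Failure.LINK:
        # Site bias decides: only the preferred site keeps serving I/O.
        return {preferred}

    survivor = "B" if failure is Failure.SITE_A else "A"
    if survivor == preferred:
        # The preferred site carries on by itself.
        return {survivor}
    # The non-preferred site survived. Assumption: it only continues when an
    # independent witness confirms that the other site is really gone.
    return {survivor} if witness else set()
```

For example, `sites_with_rw_access("A", Failure.SITE_A, witness=False)` returns an empty set in this model, which is exactly the gap the witness is meant to close.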

In essence, we have created the following setup:

© VMware

Awesome stuff, because we can do new things that weren’t quite possible before. Since VPLEX is one of the key storage virtualization solutions from EMC that allow us to perform an active/active disk access, we can perform a vMotion between the two sites, and due to the nature of VPLEX, we also perform a sort of storage vMotion on the underlying disks. That, without you having to shut down the VM to do both things at the same time. Pretty neat!

Now, as Chad describes here, a new disk connectivity state was introduced with vSphere 5, called “Permanent Device Loss” or PDL. This was a great feature to communicate to your infrastructure that a target was intentionally removed. You could unmount the disk, and remove the paths to your target in a proper way.

It was also useful to indicate an unexpected loss of your target, indicating that your cluster is in a partitioned state. The problem here was that a PDL state and VMware HA didn’t work so well together. When you had a PDL notification, HA didn’t “kill” your VM, and your virtual machine would usually continue to respond to pings, but that was about it.

Then along came vSphere 5 Update 1, which allows us to set a flag on each of the hosts inside our cluster, and set a different flag for our HA cluster. Now we can actually have HA terminate the VMs and restart them on the hosts in our cluster that still have access to their datastores in their respective preferred sites.
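How the two flags interact can be sketched as a small Python function. This is a rough model of the behaviour described above, not VMware code, and the parameter names are made up for readability. From memory of VMware’s vMSC material, the host-side setting is `disk.terminateVMOnPDLDefault` and the HA cluster option is `das.maskCleanShutdownEnabled`; treat those names as an assumption and check the documentation for your vSphere version.

```python
def vm_outcome_on_pdl(host_terminates_vm_on_pdl: bool,
                      ha_restarts_killed_vms: bool) -> str:
    """Describe what happens to a VM whose datastore enters a PDL state.

    host_terminates_vm_on_pdl: the per-host flag (assumed to correspond to
        disk.terminateVMOnPDLDefault).
    ha_restarts_killed_vms: the HA cluster flag (assumed to correspond to
        das.maskCleanShutdownEnabled).
    """
    if not host_terminates_vm_on_pdl:
        # Behaviour before Update 1: the VM is left running without storage.
        return "VM keeps running without storage"
    if not ha_restarts_killed_vms:
        # The VM is killed, but HA does not treat that as a failure to act on.
        return "VM is terminated but not restarted"
    return "VM is terminated and restarted on a host that still has access"
```

In this model both flags need to be set to get the behaviour shown in the video: `vm_outcome_on_pdl(True, True)`.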

I’ve created a short (ok, 8 minutes) video that shows exactly this scenario. You’ll get a quick view of the VPLEX setup. You’ll see the Brocade switches change from a config with the normal full zoneset to a zoneset that disables the inter-switch links between both VPLEX clusters. And you’ll see the settings inside of my vSphere lab setup, with the behavior of the hosts and virtual machines.

Since I’m quite new to creating videos like this, I hope the output is acceptable, and the video is clear enough. If you have any questions, feedback or would like to see more, please leave me a comment and I’ll see what I can do. 🙂


Just a quick modification to my post, since it wasn’t actually VM-HA (or VM monitoring) responding to the PDL event, but HA terminating the VM when running into the PDL state, as Duncan pointed out to me on Twitter. Sorry for any confusion I may have caused!

Isilon, SRM, Storage, Virtualization, VMware

Problem with the EMC Isilon Storage Replication Adapter

A lot of folks out there use VMware vCenter SRM to create and manage disaster recovery scenarios for their virtualized environments.

Besides having a button to click to fail over (parts of) your environment to a different site, it has one benefit: It forces you to think about your systems. You need to consider which systems are vital to your infrastructure, and you need to be aware of dependencies that you may have in your environment. There are numerous other things that SRM can help with, but that’s not what I wanted to highlight here.

A couple of days ago, I was at the VMware office in Munich, helping to set up an SRM 5.0 demo that would serve as a hands-on lab for people interested in SRM. The base of this SRM installation is a virtualized Isilon cluster that offers the ability to easily provision storage, and offers replication between sites (a quick video overview by my colleague Nick Weaver can be found here).

While setting up the Isilon SRA, which you can download from the VMware website, I ran into a problem. When you download and extract the actual SRA, you’ll get a bunch of PDF files, and two executables. One is the installer for the actual storage replication adapter. It’s called “EMCIsilonSRASetup_1_0.exe”, and you need a current Java development kit to get that one running, but it should install correctly.

The second file is called “IsilonReplicationHelperSetup.exe”, and this is used to configure the SRA before using it in SRM. Now, when starting this helper, both Jase McCarty and I have seen errors that refer to a missing Java class (com.izforge.izpack.installer.Installer) belonging to IzPack, the program that was used to create the installer. After extracting the actual executable, it seemed like some classes/libraries were missing from it.

I’ve been in touch with Isilon support after running into the error, and after checking with them, they gave me an MD5 hash of a working copy of the IsilonReplicationHelperSetup.exe, which is:

416535bc1c7d7f133037af04b5502e3b

However, the MD5 for the executable that I got was:

4342E880A99EE2ED6DA1205F1018233D

Which obviously is different. The MD5 of the downloaded file and the MD5 that VMware shows for the actual zip that contains the SRA matched up, though.
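If you want to run the same comparison yourself, a minimal sketch using Python’s standard `hashlib` module could look like this. The helper names are made up for this example, and the comparison ignores upper/lower case, since the hashes above are written in both styles.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 of a file as a lowercase hex string."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # Read in chunks so large installers do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's MD5 with an expected value, ignoring case."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

Calling `matches_expected("IsilonReplicationHelperSetup.exe", "416535bc1c7d7f133037af04b5502e3b")` would then tell you whether your copy matches the one Isilon support considers working.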

So, I’m putting this post out there as a word of warning. It seems like one of the Isilon SRA files on the VMware website is non-functional. Should anybody out there see this, make sure to contact Isilon support and reference case 00169080, which is my case number.

I’m still working with the Isilon support to see what the next steps are going to be, and I’m sure this is going to be resolved soon enough, but I wanted to put this information out there for you in the meantime, to avoid people having to go through the same process as I did. It might save some folks a bit of time. And I’ll make sure I update this post when I get a solution from the Isilon support team.

Update – January 16th 2012:

While I’m still working with the Isilon support group to get everything sorted out, I did get a version of the IsilonReplicationHelperSetup Java archive that seems to be working. Now, I’m sharing this with you all while we try to get things resolved, and to get the working download on the VMware site, but I need to add a large disclaimer:


This file is not officially supported by EMC and/or Isilon, and while this file worked for me, your mileage may vary, and I would recommend that you do not use this file in a production environment! The file might work in a test environment, but please refrain from using it in production. Use the official files from the VMware download site, or create a case with Isilon and/or VMware support!


Now, to help you verify this file, the MD5 for the Java archive is:
FFAC907E70FD0BFC73076793B9D5FCB4

and you can get the file here.


Update – February 10th 2012:

VMware has updated the Isilon SRA file, and the new MD5 for Version 1.0 (released 01/18/2012) currently is:

d8b8408ab259d64ee3f5a83486e2a25e

This actually contains the working files, so you should be all set. 🙂

EMC, Storage, V-MAX, Virtualization, VMware

VMAX VSA: IT’S ALIVE!!!!!!!!!!!!!!!!

So folks, here’s a shameless copy of a blog post from one of the guys on my team. Dave was just brilliant and actually created a virtual storage appliance of the EMC VMAX. I think that’s downright awesome, and I wanted to help him get attention for what he did, so I asked him if I could copy his blog post, which is what you will find here:


As the title suggests there is indeed a Symmetrix VMAX VSA. I have been working on this project since shortly after EMC World. As I look back through my emails, I received the code on 6/3/11 and I have been working on it in almost all of my free time since then.

Now finally it will make its public debut this week at VMworld 2011 as part of the EMC Interactive Demo booth on the show floor. As part of its grand unveiling I thought I would tell you a little about what makes it work.

Now to make a few things clear up front: this is a science project, and I cannot distribute it, but it does “work”. As part of the lab (I will publish the guide) the student actually provisions an iSCSI disk from the VSA to an ESXi 5.0 host.

One of the first things I noticed with the code when trying to virtualize it: it’s HUGE. There are 2 parts to the VSA.

1. The Service Processor (SP). In a physical VMAX this is the 1U server that is racked in the system bay. It has a special image of Windows XP and contains all of the proprietary software used to manage a VMAX. If you own a VMAX this is what you will see EMC field service personnel using when they come to work on your system. This is NOT accessible by an end user, as it requires special RSA credentials that change weekly (one reason we can’t distribute it). Its specs are 2 vCPU, 2GB of RAM, and about 10GB of disk space.

2. Enginuity. This is the Operating Environment of the Symmetrix. For the purposes of this VSA it runs in a SuSE Enterprise Linux 11 VM. One of the big deals with the VMAX was that Enginuity was ported from a PowerPC CPU to an Intel x86 based architecture. Without this change this VSA would never exist. Now this VM is big, so big as a matter of fact that I had to use an RC build of vSphere 5 in order to even get it to work. I was finally able to scale it down a bit, but at one point it was using 32 vCPUs, 92GB of RAM, and about 250GB of disk space.

Obviously one of the challenges for using this in a lab is that I needed it to use fewer resources. In the beginning this VMAX was a Single Engine model, which means it had 16 “slices” running. Each director has 4 DA (backend) directors, and 4 FA (front end) directors. I quickly found this was the biggest reason I needed so much memory and CPU. After working with one developer, Chakib (who totally rocks, by the way), we were able to scale this down to 1 FA and 1 DA per director. One interesting side note: when I was going down this path I asked Chakib what kind of VM he was using to test this. His reply was, “I am not using this in a VM, I have a physical Linux box with 200GB of RAM”. So I clearly had some work to do. But in its current state it uses 8 vCPU and “ONLY” 48GB of RAM. Which is still pretty darn big, but a lot better than it was when we started.
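The slice numbers above add up if you assume that a single engine holds two directors (that count is an assumption here, it is not stated above). A quick Python sketch of the arithmetic, with a made-up helper name:

```python
DIRECTORS_PER_ENGINE = 2  # assumption: one VMAX engine contains two directors


def slice_count(engines, da_per_director, fa_per_director):
    """Count the DA and FA slices running across all directors."""
    return engines * DIRECTORS_PER_ENGINE * (da_per_director + fa_per_director)


original_slices = slice_count(engines=1, da_per_director=4, fa_per_director=4)
reduced_slices = slice_count(engines=1, da_per_director=1, fa_per_director=1)
print(original_slices, reduced_slices)
```

Under that assumption the original layout comes to the 16 slices mentioned above, and the scaled-down layout runs a quarter of that. Memory and CPU use do not have to shrink by the same factor, as the vCPU and RAM figures show.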

The networking requirements are pretty simple: the SP needs 1 public NIC so that we can use its management tools, and 2 internal NICs, which are used for internal communication to the directors. In our case that’s the Linux VM. The Linux VM needed the 2 internal NICs and 1 NIC to present an iSCSI target on. Then we put our ESXi host’s VMkernel NIC on the same vSwitch so it can use the iSCSI target provided by the VSA.

So that’s all great you say, but what actually works? That’s a good question.

What works is using Standard Devices, and very small ones today. One of the things I was told when I was given the code was that this WON’T and CAN’T do any I/O. Which obviously proved to be a bit of an issue. Chakib really worked his butt off to get me something that does I/O. So this is not like the Celerra UBER VSA by @lynxbat, where you can run a VM off of it. We hope we can do that one day. Thin Pools work to the extent that you can create them and put devices in a pool, but when you present one to a host it will not work. This kept me from using the VSI SPM plugin for vSphere as part of my lab, but hey, we always have next year! The really neat part to me is that the internal tools (SymmWin) that run on the SP fully work. It’s like having an actual VMAX, but without all the fuss of getting a few 50A power drops. As an ex-customer this to me is the coolest part: I got to put on my own BIN files and use Inlines (an internal tool used to talk directly to the hardware). As a total nerd this thing is a dream come true.

So what’s next?

Well a lot of that depends on YOU! Since this is a total science project we need to show those in Symmetrix Engineering this is worth putting their time and money into. I need everyone here at VMworld this week to come try this thing, give me feedback, leave comments here, and if you aren’t at the show, express your desire for us to continue working on it. If no one is interested this will ultimately die on the vine. Please fill out this form so we can show how many of you all would like to see this project continue.

I have to give special thanks to Chad Sakac (@sakacc) and Chris Horn (@horn_Chris) for getting me involved in this project and letting me run with it, and for all of the support they gave me during this process.

Here is a link to the lab guide being used this week at VMworld. Take a look and let me know what you think!

VMAX Lab Guide

Big thanks to Matt Cowger (@mcowger), Scott Lowe (@scott_lowe), and Tee Glasgow (@teeglasgow) for their help with the lab guide. Also to Rick Scherer (@rick_vmwaretips) for the blog help.