Kubernetes, Nutanix, Virtualization

[Shorts] Connecting to Nutanix Karbon using Windows

Nutanix has released a product called “Karbon” which according to the website allows you to:

Get a production-ready Kubernetes Cluster up and running in 20 minutes.

Now, I’ll leave it up to the marketing people to verify if it takes 20 minutes or not. But once you have your cluster up and running, our developers assumed that anyone connecting to the Karbon cluster will use a Linux based operating system.

Since that isn’t the case for everyone, and I like to try other things, I wanted to see if I can connect to the Karbon cluster using Microsoft Windows as the operating system. This works fine, but there will be a couple of things you need to do.

First, log on to your Karbon cluster, select the cluster that you want to connect to, and from the “Actions” menu at the top select the “SSH Access” option, which will give you a dialog window with a download link:

Screenshot 2019-08-07 at 15.19.08

Download the file in a location of your choice. Next, open the file, but open it in an editor that is capable of properly handling files created on Linux. This is important since there is a difference in the way a press of the “Enter” key is written in text files between the operating systems.
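
To make the line-ending difference a bit more concrete: Windows editors typically write a carriage return plus a line feed (CRLF) for every press of “Enter”, while Linux tools write a single line feed (LF). The following is a minimal Python sketch, not part of the Karbon tooling, that counts both kinds in a file's raw bytes and converts CRLF to LF. The function names are my own and purely illustrative.

```python
def count_line_endings(data: bytes) -> dict:
    """Count Windows (CRLF) and Unix (LF-only) line endings in raw bytes."""
    crlf = data.count(b"\r\n")
    # Every CRLF also contains an LF, so subtract those to get LF-only endings.
    lf_only = data.count(b"\n") - crlf
    return {"crlf": crlf, "lf": lf_only}


def to_unix_line_endings(data: bytes) -> bytes:
    """Convert CRLF line endings to LF and leave existing LF endings alone."""
    return data.replace(b"\r\n", b"\n")
```

If the count for "crlf" is anything other than zero after you saved a key file, your editor changed the line endings and the key may no longer be read correctly.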

Once the file is opened, you will see two larger blocks of information. One part can be found in the “private_key” part of the file, and one in the “user_cert” part of the file. First off, select all of the lines after:

private_key='

So essentially you are copying everything from (and including):

-----BEGIN RSA PRIVATE KEY-----

down to (and including):

-----END RSA PRIVATE KEY-----

But make sure you do not include the single quotes at the start and the end. Take this and save it as a new file, for example called “karbon-user”.

Then, do the same thing for everything following:

user_cert='

And copy everything starting with “ssh-rsa-cert-v01” down to the last character of that long string, but do not include the final single quote. Save that as a file, for example called “karbon-user-cert.pub”.
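
If you would rather not copy and paste by hand, the two blocks can also be pulled out with a few lines of Python. This is only a sketch under the assumptions described above: that the downloaded file contains private_key='...' and user_cert='...' wrapped in single quotes. The function names and the output file names are just the examples used in this post, not anything Karbon prescribes.

```python
import re
from pathlib import Path


def extract_quoted(text: str, name: str) -> str:
    """Return the text between name=' and the next single quote."""
    pattern = re.escape(name) + r"='(.*?)'"
    match = re.search(pattern, text, flags=re.DOTALL)
    if match is None:
        raise ValueError(f"no {name}='...' block found")
    # Strip stray whitespace and end with exactly one Unix newline.
    return match.group(1).strip() + "\n"


def split_karbon_file(source: Path, out_dir: Path) -> None:
    """Write the private key and the user certificate to separate files."""
    text = source.read_bytes().decode("utf-8").replace("\r\n", "\n")
    outputs = {"karbon-user": "private_key", "karbon-user-cert.pub": "user_cert"}
    for filename, field in outputs.items():
        # newline="\n" stops Python on Windows from writing CRLF line endings.
        with open(out_dir / filename, "w", newline="\n") as handle:
            handle.write(extract_quoted(text, field))
```

Calling split_karbon_file(Path("downloaded-file"), Path(".")) with the path of your download would then leave both files in the current directory.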

Next up, start PowerShell on your Windows system as an administrator and start the ssh-agent service:

Start-Service ssh-agent

And if you want to have it start automatically:

Set-Service ssh-agent -StartupType Automatic

Then add the user’s private key file that you saved before:

ssh-add C:\Users\Bas\Downloads\karbon-user

Obviously replace the path above with the location on your system 😉

Next, the only thing you have to do is ssh into the IP address of your Karbon VM while passing the public key we also saved before:

ssh -i C:\Users\Bas\Downloads\karbon-user-cert.pub nutanix@192.168.0.5

Again, replace the path to the file, and replace the IP with your VM’s IP. But once that is going, you can connect to your Karbon VMs and work with them without having to set up a Linux system. 🙂

Screenshot 2019-08-07 at 15.39.54

Nutanix, Performance, SAP, Uncategorized

RDMA on Nutanix AHV and the discomfort of heterogeneous environments

In the process of setting up our new environment for SAP HANA validation work, I spent some time in the data center setting up our environment, and I ran into some caveats which I figured I would share.

To set the stage, I am working with a Lenovo HX Nutanix cluster. The cluster consists of two HX-7820 appliances with 4x Intel 8180M CPUs, 3TB RAM, NVMe, SSD and among other things two Mellanox CX-4 dual port NICs. The other two appliances are two HX-7821 with pretty much the same configuration except these systems have 6TB of RAM. The idea is to give this cluster as much performance as we can and to do that we decided to switch on Remote Direct Memory Access, also called RDMA in short.

Now, switching on RDMA isn’t that hard. Nutanix has added support for RDMA with AOS version 5.5, and according to our “one-click” mantra, it is as simple as going into our Prism web interface, clicking the gear symbol, going to “Network Configuration” and from the “Internal Interface” tab enable RDMA and put in the info about the subnet and VLAN you want to use as well as the priority number. On the switch side, you don’t need anything extremely complicated. On our Mellanox switch we did the following (note that you’d normally need to disable flow control on each port, but this is the default on Mellanox switches):

interface vlan 4000
dcb priority-flow-control enable force
dcb priority-flow-control priority 3 enable
interface ethernet 1/29/1 dcb priority-flow-control mode on force
interface ethernet 1/29/2 dcb priority-flow-control mode on force
interface ethernet 1/29/3 dcb priority-flow-control mode on force
interface ethernet 1/29/4 dcb priority-flow-control mode on force
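
Since the four port lines only differ in the last number, you could generate them instead of typing them. Here is a small Python sketch that does just that. It only reproduces the lines shown above, and the function name and its parameters are made up for this example.

```python
def pfc_port_lines(parent_port: str, split_count: int) -> list:
    """Build one priority-flow-control line per split port of a parent port."""
    return [
        f"interface ethernet {parent_port}/{index} "
        "dcb priority-flow-control mode on force"
        for index in range(1, split_count + 1)
    ]


if __name__ == "__main__":
    # Prints the four per-port lines used in the switch configuration above.
    for line in pfc_port_lines("1/29", 4):
        print(line)
```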

With all of that in place, you would normally expect to see a small progress bar and that is it. RDMA set up and working.

Except that it wasn’t quite as easy in our scenario…

You see, one of the current caveats is that when you image a Nutanix host with AHV, we pass through the entire PCI device, in this case the NIC, to the controller VM (cVM). The benefit is that the cVM now has exclusive access to the PCI device. The issue that arises is that we currently do not forward a single port, which isn’t ideal in the case of a NIC that has multiple ports. Add on top of that the fact that we don’t give you the choice which port to use for RDMA, and the situation becomes slightly muddied.

So, first off. We essentially do nothing more than see if we have an RDMA capable NIC, and we pass through the first one that we find during the imaging process. In a normal situation, this will always be the RDMA capable NIC on the PCI slot with the lowest slot number. It will also normally be the first NIC port that we find. Meaning that if you have for example a non-RDMA capable Intel NIC in PCI slot 4, and two dual port RDMA capable cards in slot 5 and 6, your designated RDMA interface is going to be the first port on the interface in slot 5.
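
To illustrate the selection rule, here is a small Python sketch. To be clear, this is not the code that runs during imaging. It is just my own model of the behavior described above, and the Nic class and the function name are made up for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Nic:
    model: str
    pci_slot: int
    ports: int
    rdma_capable: bool


def pick_rdma_port(nics: List[Nic]) -> Optional[Tuple[int, int]]:
    """Return (pci_slot, port_number) of the expected RDMA port, or None."""
    candidates = [nic for nic in nics if nic.rdma_capable and nic.ports > 0]
    if not candidates:
        return None
    # Lowest PCI slot number wins, and on that card the first port is used.
    first = min(candidates, key=lambda nic: nic.pci_slot)
    return (first.pci_slot, 1)
```

Feeding it the example from the paragraph above (a non-RDMA Intel NIC in slot 4, dual port RDMA capable cards in slots 5 and 6) returns slot 5, port 1.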

Since you might want to see what MAC-address is being used, you can check from the cVM by running the ifconfig command against the rdma0 interface. Note that this interface by default will exist, but isn’t online, so it will not show up if you just run an ifconfig command without parameters:

ifconfig rdma0
rdma0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::ee0d:9aff:fed9:1322  prefixlen 64  scopeid 0x20<link>
        ether ec:0d:9a:d9:13:22  txqueuelen 1000  (Ethernet)
        RX packets 71951  bytes 6204115 (5.9 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 477  bytes 79033 (77.1 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

To double check if you have the correct interface connected to your switch ports, my tip would be to access the lights-out management interface (IMM, iLO, iDRAC, IPMI, etc.) and check your PCI devices from there. Usually these will tell you the MAC of the interface for the various PCI devices. Make sure you double check if you are connected to the right physical NIC and switch port.
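
If you have to compare a lot of MAC addresses, a bit of scripting helps. The following Python sketch pulls the MAC out of ifconfig output like the one shown above and compares it with a value copied from the lights-out interface, ignoring case and separator style. The function names are my own, and how you get the MAC out of your BMC is up to you.

```python
import re
from typing import Optional


def parse_mac(ifconfig_output: str) -> Optional[str]:
    """Return the MAC address from the 'ether' line of ifconfig output."""
    match = re.search(
        r"\bether\s+([0-9a-fA-F]{2}(?::[0-9a-fA-F]{2}){5})\b", ifconfig_output
    )
    return match.group(1).lower() if match else None


def normalize_mac(mac: str) -> str:
    """Lowercase a MAC address and drop separators such as ':' or '-'."""
    return re.sub(r"[^0-9a-f]", "", mac.lower())


def same_mac(first: str, second: str) -> bool:
    """Compare two MAC addresses regardless of case and separator style."""
    return normalize_mac(first) == normalize_mac(second)
```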

The next topic that might come up is the fact that we will automatically disable c-states on the AHV host in the process of enabling RDMA. This is all done in the background, and again normally will be done automatically. In our case, since we added a couple of new nodes to the cluster, the BIOS settings were not the same across the cluster. The result of that was that on the AHV hosts, the HX-7820 nodes had the following file available that contained a value of “1”:

/sys/devices/system/cpu/cpu*/cpuidle/state[3-4]/disable

Due to the BIOS settings that were different on the HX-7821 hosts that we added, this file and the cpuidle (sub-)directories didn’t exist on the host. While the RDMA script tried to disable c-states 3 and 4 on the hosts, this was only successful on two out of the four nodes in the cluster. Upon comparing the BIOS settings we noticed some deviations in available settings due to differences in versions, and differences in some of the settings as they were delivered to us (MWAIT for example). After modifying the settings to match the other systems, the directories were now available and we could apply the c-states to all systems.
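
A quick way to spot this situation on a host is to check whether those files exist and what they contain. Below is a Python sketch of such a check. It is not the script that Nutanix runs when enabling RDMA, just a simple read-only look at the sysfs path mentioned above, and the function names are my own.

```python
import glob
import os


def cstate_disable_report(sysfs_cpu_root: str = "/sys/devices/system/cpu") -> dict:
    """Map every state3/state4 'disable' file that exists to its content."""
    pattern = os.path.join(
        sysfs_cpu_root, "cpu[0-9]*", "cpuidle", "state[3-4]", "disable"
    )
    report = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as handle:
            report[path] = handle.read().strip()
    return report


def cstates_disabled(report: dict) -> bool:
    """True only if files were found and every one of them contains '1'."""
    return bool(report) and all(value == "1" for value in report.values())
```

An empty report is exactly what we saw on the newly added nodes: the cpuidle directories simply were not there, so there was nothing to write a “1” into.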

While we obviously have some work to do to add some more resiliency and flexibility to the way we enable RDMA, and it doesn’t hurt to have an operational procedure to ensure settings are the same on all systems before going online with them, I just want to emphasize one thing:

One click on the Nutanix platform works beautifully when all systems are the same.

There are however quite a couple of caveats that come into play when you work with a heterogeneous environment/setup:

  • Double check your settings at the BIOS level. Make them uniform as much as you can, but be aware of the fact that sometimes certain settings or options might not even be available or configurable anymore.
  • Plan your physical layout. Try not to mix a different number of adapters per host.
  • Create a physical design that can assist the people cabling with what to plug where to ensure consistency.
  • You can’t always avoid making changes to a production system, but if at all possible, have a similar smaller cluster for the purpose of quality assurance.
  • If you are working in a setup with a variety of systems things will hopefully work as designed but might not. Log tickets where possible, and provide info that goes a bit further than “it doesn’t work”. 😉
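
For the first point in the list, comparing settings by eye across nodes gets old quickly. If you can export the BIOS settings of each host into simple key/value pairs (how you do that depends on your vendor's tooling, which I am not covering here), a Python sketch like the following shows only the settings that differ or are missing. The function name, host names and setting names are purely hypothetical.

```python
def differing_settings(settings_by_host: dict) -> dict:
    """Return only the settings whose value is not identical on all hosts."""
    all_keys = set()
    for settings in settings_by_host.values():
        all_keys.update(settings)

    differences = {}
    for key in sorted(all_keys):
        # A setting that a host does not offer at all is reported explicitly.
        values = {
            host: settings.get(key, "<not available>")
            for host, settings in settings_by_host.items()
        }
        if len(set(values.values())) > 1:
            differences[key] = values
    return differences
```

Anything this returns is worth looking at before you put the node into the cluster.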

Oh, and one more thing. Plan extra time. The “quick” change of cables and enabling of RDMA ended up in spending 4 hours in the data center working through all of this. And that is with myself being pretty familiar with all of this. If you are new to this, again if at all possible, take your time to work through this, versus doing this on the fly and running into issues when you are supposed to be going live. 🙂

Nutanix, Nutanix AHV, SAP, Virtualization

Bas, what the hell have you been doing? – an SAP HANA on Nutanix story

To say my last blog post is “a while ago” would be a grave understatement. Unfortunately, I’ve mainly been busy with something that was entirely new for Nutanix, and with the amount of work involved and the sensitive nature of what I was working on, there was relatively little room left to blog. Especially since I usually ended up blogging about stuff I stumbled upon while doing my job or researching.

This all started with me changing from my presales focussed role to our internal “Solutions & Performance Engineering” team, which focusses on the business-critical applications running on the Nutanix platform. In essence, those applications that are the lifeblood of a company. Applications which, if they are unavailable, will cost the company significant amounts of money.

One of those applications is SAP, or more specifically the SAP HANA in-memory database. My colleagues (mainly Kasim Hansia, Rainer Wacker and Alexander Thoma) had already been doing a great job, and all of the SAP applications were certified to run on the Nutanix platform in November 2016. The big question was always “When can we run SAP HANA on Nutanix?”.

Working on the answer to this question is what I’ve been busy with the last year or so. While I won’t bother you with the specific details on the business side of things, I do want to take a bit of time to show what it’s like to go through the process of validating a new application.

First off, the SAP HANA in-memory database is an application that scales to levels that many people won’t ever see in action. You can run HANA in two ways. You either scale up the resources of a single server, for example running with up to 20TB of memory, or you can scale out by adding multiple servers and distributing the load across all servers.

Now, SAP has given the customer two options to select hardware to run SAP HANA. One is an “appliance model” where you choose a configuration as a whole, and everything will run in a pre-tested and validated fashion. You are ensured of a specific behavior of the whole system while running your application. The other option is something called “Tailored Data Center Integration” or TDI in short, where in essence you select your hardware from a hardware compatibility list and have the freedom to mix and match.

What we have done is work with SAP to introduce a new third category called “Hyperconverged Infrastructure” or HCI. The HCI category assumes that we are running SAP HANA in a virtualized fashion, and “collapse” several infrastructure components such as compute and storage to an integrated system.

The limitations on the maximum sizes for this category are smaller than for the other two categories, but the requirements that are in place for this certification do not offer much more leeway. For example, a storage test to ensure storage performance, where initially log overwrite operations needed to have latency <= 400 microseconds (this changed later on). Another example is a test suite of close to 700 tests that emulate real-world issues, and the performance delta is then compared to a bare-metal installation, with only a specific maximum performance delta between the two giving you a passing grade.

All this meant that I had my work cut out for me from the start. We started off working with a server model that hadn’t been qualified before, and then switched to the validation hardware, namely a Lenovo SR950. A big four-socket server with the fastest CPUs we could use, namely the Intel Xeon Platinum 8180M processors, 3072 GB of RAM, 3.84 TB SSDs and 1.6 TB NVMe drives.

Now, as much as Nutanix is a software company, we do strictly check that hardware meets specific prerequisites to ensure a smooth user experience and to make sure that certain performance metrics are a given. The issue is that all of the checks and functionality in place didn’t work for this new hardware. Simple things like making the status indicator LED for the NVMe light up, or mapping the physical drive locations back to the diagram view in Prism. It meant modifying Python files that handle how hardware is accessed, packaging everything back up into Python egg files, restarting services and then magically seeing drives that the system was able to access. Or passing through NICs so that we could test with “RDMA over Converged Ethernet” or RoCE, and changing BIOS settings to ensure maximum performance.

And while pushing the underlying hardware to its limits, it also meant we had to dive deep, and I mean very deep, into the software side of things. From things like experimenting with c-states on the CPU and NIC multiqueueing in the virtual machines, down to changing parameters in our product, ensuring that specific CPU features are passed through, the pinning of virtual CPUs to their physical location or making changes to how often a vmexit instruction is called.

I can’t go into all of the specific details since some of it is Nutanix’ intellectual property, but I’ll check what I can share in future posts, and if you have any specific questions, please ask them, and I’ll try to answer as best I can. What I can say is that we pushed the limits of our platform and quite a couple of the things we found are going to be implemented back into the product, and I expect a lot of those changes to show up in the release notes of an upcoming release.

Fact is, I learned a ton of new things, and this all culminated in our validation for pre-production use of SAP HANA on Nutanix AHV as announced in https://www.nutanix.com/2018/06/05/finally-can-talk-sap-hana-nutanix/, and we are working full steam ahead on the last steps to get production support. It was and continues to be one hell of a journey, and I just wanted to give you guys a bit of insight into what it is like working on the engineering side of the Nutanix platform, and what a project like this entails.

I want to finish with a special thank you to (in no particular order), Rainer, Alexander, Kasim, Malcolm, Jonathan, and the extended team. It’s been one heck of an experience! 🙂

Uncategorized

Using a noVNC branch to connect to your Supermicro iKVM

Do you know how everyone just loves client side Java? Yeah, exactly. That’s why a lot of Nutanix customers will be quite happy with the IPMI firmware update for our latest systems. You will no longer have to rely on Java to use the lights-out management, but you can simply use HTML5 to manage your systems.

Unfortunately, this update isn’t available to older systems. After cursing at Java again in my home lab, I decided to see if there was a way around it. Fortunately, I noticed that on GitHub, a developer called “kelleyk” posted a port of noVNC that adds support for ATEN iKVM, which is used in quite a couple of Supermicro servers.

So, after a bit of fiddling, I managed to get this to work on my Mac. First things first, download the fork from here: https://github.com/kelleyk/noVNC

Next up, make sure you have the xcode command line tools installed. If not, just run:

xcode-select --install

Also, since we will need a web socket bridge to forward requests to your IPMI interface (which encrypts traffic), make sure you have a certificate named self.pem, or generate one by issuing:

openssl req -new -x509 -days 365 -nodes -out self.pem -keyout self.pem

Once that is done, the rest is simple. Open a terminal and go to the directory where you copied the noVNC fork, and go to the “utils” directory. From there, run the launch.sh shell script, and provide it with the IP-address of your IPMI interface and use the default VNC port as the port number:

./launch.sh --vnc 192.168.10.10:5900

This will launch the script, and give you a link that you can open in your browser:

Using local websockify at ~/noVNC-bmc-support/utils/websockify/run
Starting webserver and WebSockets proxy on port 6080
WebSocket server settings:
 - Listen on :6080
 - Flash security policy server
 - Web server. Web root: /Users/basraayman/Downloads/noVNC-bmc-support
 - SSL/TLS support
 - proxying from :6080 to 192.168.10.10:5900

Navigate to this URL:

 http://Bass-MacBook-Pro-Retina.local:6080/vnc.html?host=Bass-MacBook-Pro-Retina.local&port=6080

Follow the link in your browser, leave the values as they are, and in the password field, input your IPMI username and password separated by a colon, so in the following format (note that ADMIN is both the default username and password in this case):

ADMIN:ADMIN
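
If you manage more than one system, the two things that change per host are the launch.sh argument and the combined credential string. As a small Python sketch (the function names are mine, and it only builds the values, it does not start anything):

```python
def launch_command(ipmi_host: str, vnc_port: int = 5900) -> list:
    """Build the launch.sh call from above for a given IPMI address."""
    return ["./launch.sh", "--vnc", f"{ipmi_host}:{vnc_port}"]


def ikvm_password_field(username: str, password: str) -> str:
    """Combine username and password the way the password field expects."""
    return f"{username}:{password}"
```

The list returned by launch_command can be handed to subprocess.run from within the “utils” directory if you want to start the bridge from a script.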

screen-shot-2016-12-03-at-15-56-09

Once that is done, click the connect button, and you are now able to connect to the lights-out interface using your browser, no Java needed. And while this might not be super ideal (no forwarding of ISO images and such), it should make day to day administration a bit easier.

screen-shot-2016-12-03-at-16-00-55

So, give it a whirl, and let me know if this works for you in the comments. 🙂

Uncategorized

[Shorts] Using the Nutanix Docker Machine Driver

Nutanix announced version 4.7 of its AOS, basically the operating system for the controller VMs. One of the new things is the so-called Acropolis Container Services. This basically allows the Nutanix cluster to act in such a way that you can, for example, use the docker-machine command (with a corresponding driver) to create new Docker instances that deploy automatically on your cluster.

While we are still waiting for the Docker Machine Driver to be posted to our portal, I’ve dug up an internal version of the driver, and decided to test it and post my experiences.

Right now it’s relatively simple to get started. You need your Nutanix cluster running AHV and some working credentials for the cluster. Also, you need a system that has Docker machine installed.

Once you have that, download the Docker machine driver for your platform from Nutanix (once it is posted that is). We should be offering a driver for OSX, Windows and Linux, so you should pretty much be covered.

On OSX the procedure to get started is relatively simple. Copy the file you downloaded, and if it has an .osx extension in the filename, remove that extension. Modify the file to allow it to be executed (chmod +x), and then move it to the directory that contains your docker-machine executable/binary.

Now, you can start docker-machine and call the Nutanix driver, and that will give you a couple of additional options:

docker-machine create --driver nutanix 

Usage: docker-machine create [OPTIONS] [arg...]
 Create a machine

Description:
 Run 'docker-machine create --driver name' to include the create flags 
 for that driver in the help text.

Options:
 .... 
 --nutanix-endpoint Nutanix management endpoint ip address/FQDN [$NUTANIX_ENDPOINT]
 --nutanix-password Nutanix management password [$NUTANIX_PASSWORD]
 --nutanix-username Nutanix management username [$NUTANIX_USERNAME]
 --nutanix-vm-cores "1" Number of cores per VCPU of the VM to be created [$NUTANIX_VM_CORES]
 --nutanix-vm-cpus "1" Number of VCPUs of the VM to be created [$NUTANIX_VM_CPUS]
 --nutanix-vm-image [--nutanix-vm-image option --nutanix-vm-image option] The name of the VM disks to clone from, for the newly created VM
 --nutanix-vm-mem "1024" Memory in MB of the VM to be created [$NUTANIX_VM_MEM]
 --nutanix-vm-network [--nutanix-vm-network option --nutanix-vm-network option] The name of the network to attach to the newly created VM
 ....

Now, to create for example a VM based on an ISO image that is stored in your image configuration, you simply use one command:

docker-machine create --driver nutanix --nutanix-username 'docker-deployment-user' --nutanix-password 'P@ssw0rd' --nutanix-endpoint '10.0.0.50:9440'  --nutanix-vm-network production --nutanix-vm-image CentOS-7 Docker-CentOS

And you can obviously also add things like VM memory and CPU information. Once you hit enter, the driver connects to the Nutanix cluster and tells it what to do.
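
As an example of adding memory and CPU information, here is a Python sketch that builds the command as an argument list. It only uses the flags from the help text above. Passing the credentials through the environment variables listed there (NUTANIX_ENDPOINT and friends) instead of on the command line is my assumption of how you would typically use them, and I have not verified this against the internal driver build. The function names are my own.

```python
import os


def docker_machine_create_args(machine_name, network, image,
                               cpus=1, cores=1, mem_mb=1024):
    """Build a 'docker-machine create' call for the Nutanix driver."""
    return [
        "docker-machine", "create", "--driver", "nutanix",
        "--nutanix-vm-network", network,
        "--nutanix-vm-image", image,
        "--nutanix-vm-cpus", str(cpus),
        "--nutanix-vm-cores", str(cores),
        "--nutanix-vm-mem", str(mem_mb),
        machine_name,
    ]


def nutanix_environment(endpoint, username, password):
    """Copy the current environment and add the Nutanix driver variables."""
    env = dict(os.environ)
    env.update({
        "NUTANIX_ENDPOINT": endpoint,
        "NUTANIX_USERNAME": username,
        "NUTANIX_PASSWORD": password,
    })
    return env
```

Both results can then be passed to subprocess.run(args, env=env, check=True), which keeps the password out of your shell history.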

The result?

Screen Shot 2016-07-07 at 14.04.14

And you can manage your systems using docker-machine like you are used to. Now there is obviously more you can do, but I just wanted to give you a quick way to get started. So, have fun playing with it. 🙂

Containers, Nutanix

Getting started with CoreOS on Nutanix Community Edition

Containers seem to be the hot trend right now. I needed to get some more experience in this area, and instead of working with a single container machine, I actually wanted to get a “quick” distributed setup going. It wasn’t all that quick to start with, but I now have a working setup that can actually be rolled out and scaled in a pretty quick fashion.

Now, I’m assuming you already know what a container is, and have heard about CoreOS. Here are some quick steps to get you started. I’ll start off with the prerequisites:

  • You will have your Nutanix CE cluster up and running
  • You have a VLAN set up with IP address management and a DHCP server on Nutanix CE
  • You downloaded the CoreOS ISO image

Your further steps are relatively simple. First off, we will create an etcd master. The most important thing we need is a fixed IP, so decide which IP you want to give it. Obviously we could use the CoreOS cluster discovery mechanism and rely on an internet connection, but I decided to just use my own instance instead.

Start off by creating a cloud-config file for your etcd master:

#cloud-config
ssh_authorized_keys:
  - ssh-rsa AAAAB3NzaC1...
coreos:
  etcd2:
    name: etcdserver
    initial-cluster: etcdserver=http://<etcd-vm-ip-here>:2380
    initial-advertise-peer-urls: http://<etcd-vm-ip-here>:2380
    advertise-client-urls: http://<etcd-vm-ip-here>:2379
    # listen on both the official ports and the legacy ports
    # legacy ports can be omitted if your application doesn't depend on them
    listen-client-urls: http://0.0.0.0:2379,http://0.0.0.0:4001
    listen-peer-urls: http://0.0.0.0:2380
  units:
    - name: etcd2.service
      command: start
    - name: 00-eth0.network
      runtime: true
      content: |
        [Match]
        Name=eth0

        [Network]
        DNS=<your-dns-ip-here>
        Address=<etcd-vm-ip-here>/16
        Gateway=<your-gateway-ip-here>

Note that I’ve copied in the public ssh key from my laptop to get easier access to the VM. Now, save this file as a text file called user_data, and create an iso image using the ways described here. Copy that over to your container on CE using sftp to a controller VM on port 2222. You can use Prism credentials to authenticate.
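
In case you want to script the ISO creation: as far as I know, the CoreOS config drive convention is an ISO with the volume label config-2 that contains the file at openstack/latest/user_data. The Python sketch below stages that directory layout and builds a matching mkisofs call. Treat the layout and the mkisofs flags as my assumption of that convention, and note that the function names are made up for this example.

```python
import shutil
from pathlib import Path


def stage_config_drive(user_data: Path, staging_dir: Path) -> Path:
    """Copy user_data into the openstack/latest layout below staging_dir."""
    target_dir = staging_dir / "openstack" / "latest"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "user_data"
    shutil.copyfile(user_data, target)
    return target


def mkisofs_command(staging_dir: Path, iso_path: Path) -> list:
    """Build the mkisofs call that turns the staged directory into an ISO."""
    return [
        "mkisofs", "-R", "-V", "config-2",
        "-o", str(iso_path), str(staging_dir),
    ]
```

Running the returned command through subprocess.run requires mkisofs to be installed on your system.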

Next step, create a new VM in Acropolis, attach the CoreOS ISO image as your primary CD drive, and the ISO you just created as the second CD drive, and power on the VM.

Now, to create the actual CoreOS cluster, you create a second user_data file, that only contains the following:

#cloud-config
ssh_authorized_keys:
  - ssh-rsa AAAAB3NzaC1...
coreos:
  etcd2:
    proxy: on
    initial-cluster: etcdserver=http://<etcd-vm-ip-here>:2380
    listen-client-urls: http://localhost:2379
  fleet:
    etcd_servers: http://localhost:2379
    metadata: "role=etcd"
  units:
    - name: etcd2.service
      command: start
    - name: fleet.service
      command: start

For quick deployment, I’d create a VM that you use as a template to clone from. Give the VM the newly created file as the secondary drive.
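
Since the only things that change in this second file are the etcd IP and your ssh key, you could also generate it. Here is a Python sketch that fills in a template of the worker cloud-config from above. The template mirrors that file, while the function name and the IP check are my own additions.

```python
import ipaddress

WORKER_TEMPLATE = """#cloud-config
ssh_authorized_keys:
  - {ssh_key}
coreos:
  etcd2:
    proxy: on
    initial-cluster: etcdserver=http://{etcd_ip}:2380
    listen-client-urls: http://localhost:2379
  fleet:
    etcd_servers: http://localhost:2379
    metadata: "role=etcd"
  units:
    - name: etcd2.service
      command: start
    - name: fleet.service
      command: start
"""


def render_worker_user_data(etcd_ip: str, ssh_key: str) -> str:
    """Fill the etcd master IP and the public ssh key into the template."""
    # Raises ValueError if etcd_ip is not a valid IP address.
    ipaddress.ip_address(etcd_ip)
    return WORKER_TEMPLATE.format(etcd_ip=etcd_ip, ssh_key=ssh_key)
```

Write the result to a file called user_data and build the ISO from it the same way as for the etcd master.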

Now, just create some clones, power them on and wait for them to get their IP. You should then be able to ssh into the machine using the “core” user, and check your cluster:

core@CoreOS-1 ~ $ fleetctl list-machines
MACHINE     IP             METADATA
1c24fc23... 192.168.96.248 role=etcd
8e974c05... 192.168.79.25  role=etcd
a899b944... 192.168.114.8  role=etcd
f916eb93... 192.168.3.179  role=etcd

With that, you can start implementing and rolling out your units:

core@CoreOS-1 ~ $ fleetctl load hello.service
Unit hello.service loaded on 1c24fc23.../192.168.96.248

core@CoreOS-1 ~ $ fleetctl list-units
UNIT          MACHINE                    ACTIVE   SUB
hello.service 1c24fc23.../192.168.96.248 inactive dead

Apple, Fusion, VMware

[Shorts] Fusion 7.1 cannot perform a P2V with iCloud Password on OS X Mavericks

Recently my parents switched to a Mac from a PC, and I helped set things up for them. The machine was quite nice, a new iMac Retina, and I helped my dad migrate their old PC to a VM on the new Mac. Now, this entire process is pretty straightforward. You put the Mac and the PC on the same network, input a four digit token (or input an IP and port), then authenticate with a user and password, and things should work.

Unfortunately, that wasn’t quite the case for me. The Windows system wanted to have UAC disabled on the Windows 8 machine, which isn’t a problem, but the error message wouldn’t go away. The problem was that it was a pretty generic error message, “A failure occurred”, without even so much as an error code that made any sense. The log files also didn’t help.

While I was guessing it might be related to a username problem (spaces in the username), I tried several things on the Windows side, and checked the user on the Mac as well. It was then I found out that under OSX Mavericks, you can now enable using an iCloud password, which was already set up (using a screenshot here of my MacBook Pro as an example):

iCloud Password

Long story short, as soon as I used a separate local password for the user, the P2V migration worked like a charm. I enabled the iCloud password again, and the migration wouldn’t go through. Since I wasn’t able to find this in the VMware KB, I figured I might as well share this here.