vSphere 5 vRAM impact by the numbers – 39% or 10%?

This is a slightly controversial post, but I hope people will take it in a spirit of curiosity and openness of information, rather than as an attack on the new vSphere licensing model. Also, sorry it’s far too long – I really need an editor!

For those who aren’t that involved on Twitter or the VMware blogs and forums, there’s been a huge amount of discussion over the recent licence changes by VMware for vSphere 5, regarding the vRAM caps on per-processor licences.

On a personal level – as a fan of the vSphere technology, a VMware Certified Professional, and a VMware business partner – it’s fairly disappointing that while much of vSphere 5 was well discussed and trialled in advance, the new licences seem to have been far less “field trialled”. Although the new model is much more in line with how VMware deals with service providers, the surprise of the move has distracted significantly from the positive improvements in the new vSphere version, and wasted much of the initial launch effort.

VMware were initially very quiet while people first reacted to the vSphere licence changes – which, they say, will mean no cost increase for most customers and large price increases for only a few – but they’ve now started to come out with some firmer justifications and explanations, which in general sound reasonable.

One very useful blog post, Understanding the vSphere 5 vRAM licensing model, includes a graph which attempts to show how the exact final figures were arrived at, and I would highly recommend reading it first if you haven’t already.

While the graph is very clear, there are no raw figures provided, and it seemed like those raw figures would give a much clearer answer to the question of “How many people will actually be impacted by the vRAM change?”, so I sat down with Excel and paint.net to try and work back to an estimate of the raw figures, which I’ve set out below.

 

[Table 1: estimated raw figures of VMs per processor, derived from VMware’s graph]

 

Now, I’m sure the figures aren’t perfect, but I think they’re pretty close – they give an average of 5.69 VMs per processor, while VMware have got 5.7.

What my raw figures also show, though, is that a high number of entries sit above the “5.7 VMs per Processor” average – 39% of them in total.

This is based on the total percentage below 6 VMs per Processor (for ease of calculation, plus it’s a handy margin of error), which comes to 60.3%, leaving 39.7% with usage above 6 VMs per Processor.
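
For anyone who wants to sanity-check the arithmetic, here’s a minimal Python sketch of the calculation. The bin counts below are purely hypothetical placeholders – substitute the estimated raw figures from the table above:

    # Rough sketch of the vRAM impact sums, using HYPOTHETICAL bin counts --
    # substitute the real figures estimated from VMware's graph.
    # Each entry maps "VMs per processor" to the number of survey entries.
    bins = {
        1: 120, 2: 180, 3: 160, 4: 130, 5: 100,
        6: 80, 7: 60, 8: 45, 10: 35, 12: 30,
        14: 25, 16: 20, 20: 15,
    }
    total = sum(bins.values())

    # Weighted average VMs per processor (VMware quote 5.7; my figures gave 5.69)
    average = sum(vms * count for vms, count in bins.items()) / total
    print(f"Average VMs per processor: {average:.2f}")

    # Share of entries below 6 VMs per processor (my figures: 60.3% below, 39.7% above)
    below_6 = sum(count for vms, count in bins.items() if vms < 6) / total
    print(f"Below 6 VMs/proc: {below_6:.1%}, above: {1 - below_6:.1%}")

    # Share at 12 VMs per processor or more -- the customers most exposed
    # to the vRAM caps (roughly 10% in my figures)
    at_least_12 = sum(count for vms, count in bins.items() if vms >= 12) / total
    print(f"12+ VMs/proc: {at_least_12:.1%}")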

While those 39.7% won’t automatically be charged more by VMware under the new vRAM rule, it certainly gives an estimate of the number of customers who will be impacted by these rules, either now or in the future.

More significantly, 10% of the results are at 12 VMs per Processor or above. Those 10% are almost certain to be impacted and will be very unhappy right now – these are the people the VMware account managers are going to have to earn their money placating.

Now I don’t believe huge numbers of people will switch from VMware to Microsoft or Citrix over this, and I’m pretty sure most existing customers will end up striking “no cost upgrade” deals with their sales reps, but it seems clear that these 10% of users should have been given significant warning of the coming change, and an opportunity to formally buy much cheaper upgrade deals, rather than having to go through the cycle of “surprise, outrage, shout at sales, get the discount they should have been given on day 1”.

The most-written two-word phrases in English?

For a few years now I’ve had a website listing the longest words in English, so when I saw the Google Ngram data, I thought it could be fun to poke around with.

The data is based on scans from the Google Books project, covering roughly the last 200 years and 500,000 published books, and contains the frequency of words and phrases, broken down by year.

One thing Google don’t seem to mention is the most common phrases found, so I decided to work them out. I wrote some ugly but workable scripts, downloaded the data to a Rackspace cloud server, and added up the results, limited to two-word phrases (2-grams) – the core of the counting looks something like the sketch below.
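
Here’s a minimal Python sketch of that counting step. It assumes the 2009-era record format of one tab-separated line per ngram/year pair (ngram, year, match_count, page_count, volume_count) – check the dataset documentation if you’re following along, and note the filename in the usage comment is just illustrative:

    from collections import Counter

    # Sum total occurrences of each 2-gram across all years. Each input line
    # is assumed to be a tab-separated record:
    # ngram<TAB>year<TAB>match_count<TAB>page_count<TAB>volume_count
    def tally(lines, totals=None):
        if totals is None:
            totals = Counter()
        for line in lines:
            fields = line.rstrip("\r\n").split("\t")
            if len(fields) != 5 or not fields[2].isdigit():
                continue  # skip malformed records
            totals[fields[0].lower()] += int(fields[2])
        return totals

    # Usage: run each data file through the same Counter, then take the top few.
    # with open("googlebooks-eng-all-2gram-20090715-0.csv") as f:
    #     totals = tally(f)
    # for phrase, count in totals.most_common(5):
    #     print(phrase, count)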

From the results, I’ve found the most common two word combinations used in English are:

Most common two-word grams:
of the
in the
to the
and the
on the

See the graph output of the phrases – “of the” appears more than twice as often as the next most common, “in the”.

Kind of boring, right? So I looked further down, searching for the most common phrases which weren’t just pairs of very short words, and found these inside the top 500:

united states
new york

See the graph output

So there you have it, it seems that the United States is the most written-about thing ever, closely followed by New York!

If you’re interested in how I did the technical bits and pieces, let me know and I’ll tidy the scripts up a bit and upload them to GitHub. Overall, the scripts took around 12 hours to download the 25GB of files, uncompress them, and compile the raw data into something quicker to query.
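
In the meantime, here’s roughly the shape of the download-and-aggregate loop as a Python sketch. The shard URL pattern and shard count are assumptions based on how the 2009 dataset appeared to be published, so check the official ngrams download page for the real file list before running anything like this:

    import io
    import urllib.request
    import zipfile
    from collections import Counter

    # ASSUMED shard URL pattern and count -- check the Google Books ngrams
    # download page for the real file list before running this.
    BASE = ("http://commondatastorage.googleapis.com/books/ngrams/books/"
            "googlebooks-eng-all-2gram-20090715-{}.csv.zip")
    SHARDS = 100

    totals = Counter()
    for shard in range(SHARDS):
        # Each shard is a zip of one large CSV; this sketch holds a whole
        # compressed shard in memory, which is fine on a decent cloud server.
        with urllib.request.urlopen(BASE.format(shard)) as resp:
            archive = zipfile.ZipFile(io.BytesIO(resp.read()))
        for name in archive.namelist():
            for raw in archive.open(name):
                fields = raw.decode("utf-8", "replace").rstrip("\r\n").split("\t")
                if len(fields) == 5 and fields[2].isdigit():
                    totals[fields[0].lower()] += int(fields[2])

    # Note: one Counter over every distinct 2-gram will use a lot of RAM;
    # in practice you'd prune rare entries periodically or spill to disk.
    for phrase, count in totals.most_common(20):
        print(phrase, count)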

What’s the point of PaaS again?

A pretty simple tweet by Coté, an industry analyst at RedMonk, asking for sentiment about using PaaS services got me thinking – and at 4:50pm on a Friday, that’s a dangerous thing!

PaaS (or Platform as a Service) is a great concept in computing, though not a particularly new one – rather than managing the base operating system, databases, application servers and everything else a modern application relies on, you simply deploy your own custom code onto someone else’s pre-built infrastructure. You get all the benefits of low-cost commodity computing (if you pick a cheap host), without the headache of having to build or operate your own management systems.

However, most of the new PaaS offerings out there seem to consist of little more than a provisioning engine that deploys your application onto a static set of infrastructure, often hosted on Amazon EC2 or an internal equivalent, without offering significant functionality around load balancing, dynamic scaling of resources, database performance options, disaster recovery and so on.

I don’t expect all these things to be free, or even cheap, but you should be able to turn them on and off as and when you need them, and things like disaster recovery plans are not really optional in the world of enterprise computing – simply writing in your DR plan that it’s the responsibility of your application host, and that you have no details of their plans, really won’t cut it!

There are great new PaaS offerings out there, like Cloud Foundry, which I’m a big fan of for rapid application development and deployment, but until the various service providers raise their game on the heavy lifting of auto-scaling, DR and so on, there seems very little point in picking a PaaS service over building your own infrastructure on top of a typical IaaS platform.