All of Your Eggs in One Basket?
Every time a new version of a virtualization tool comes out, people get all excited about the possibility of reducing the costs of running their data centers. Most of them are thinking in terms of hardware consolidation, but virtualized environments also allow for new ways of handling resilience and recovery as well.
This is all well and good, but then you run into what I call "the new hammer syndrome". The idea is, after you buy a new hammer, everything starts to look like a nail. You keep thinking of ways to apply this new tool. Some of them are great. Others, not so much.
Virtualization and SharePoint
SharePoint, in particular, frequently seems like a ripe target for virtualization. It has a multitude of roles, which can be (and often are) distributed among many servers. Data center managers see all of this hardware and envision collapsing it onto a single box. And, SharePoint is officially supported in virtualized environments. It seems like a match made in heaven, doesn't it?
But, not so fast! Take a step back and think about what I just said. You are taking all of these SharePoint roles, and spreading them out among multiple servers. Now you want to take all of these servers and virtualize them back onto a single piece of hardware? Where is the sense in that?
Why did you build out that many servers in the first place?
Pausing while people try to gather back the pieces of their exploded heads from that bit of circular logic...
Profound, isn't it? Almost like a Zen koan.
On Over-Engineering a Solution
SharePoint will very happily run all of its roles on a single server (physical or virtual), if you want it to. So, why would you want to split the roles at all? There are really only two reasons (good ones, anyway) - performance, and resilience. (No, I don't consider being able to point at a monitoring station covering a half-dozen or more servers and saying "Look at all of the systems I manage in my SharePoint farm" a good reason.)
From a performance standpoint, some SharePoint roles are real resource hogs. The two big ones would be SQL Server and Search Indexing/Crawling. SQL Server, though not technically part of SharePoint, is hit pretty much constantly by nearly every SharePoint component, and so is almost always set aside on separate hardware. The search crawl process, though intense, is "peaky." In other words, it goes through cycles of short bursts of intense activity, followed by periods of near idleness. In comparison, the web front-end functionality is a cake-walk. A single WFE server can handle potentially many thousands of users without breaking a sweat.
In fact, a big, modern server can probably handle hundreds, if not thousands, of users even with everything except SQL Server running on it. (Naturally, this depends upon just what those users are doing.) So performance is rarely the real reason for splitting off most of SharePoint's functions, except in very large environments.
That leaves resilience. Resilience is the ability of a system to keep on going, even if a piece of it fails. By splitting the SharePoint roles onto several servers, and having multiple instances of the roles that face your users, you can create a system which can tolerate a failure of any one component with minimal short-term impact. It is possible to take this too far, however. It isn't a case of "if two are good, three must be better, and five are better still."
What good is having three or more web front end servers, when they all sit practically idle at singe-digit percent utilization - even during peak periods? Not very good at all. At a minimum, it is not a very efficient use of resources. This is the kind of thing that makes virtualization look really attractive.
Balance
So, is virtualization really the answer? Maybe. Or maybe not. Let's get back to that biggest of little questions - "Why?"
Why did you build out your farm onto multiple servers?
Did you build out this big farm because your usage is so heavy, you were stressing out anything less? Then you are almost certainly not a good candidate for virtualization. In this case, you've got your hardware optimized to its load. Virtualization is just going to add another layer, and if you're already fully loading your systems, you won't get any benefit from host sharing with other virtual servers. The only reason you might justify going virtual is to be able to quickly replicate a failed system from an image, or shift a running image onto another host. But can you handle the extra overhead?
Did you build out your farm for resilience? The minimum "fully" resilient SharePoint farm is a configuration I call "2.1+". That's two servers running as WFE and Query servers (along with Excel Services in Enterprise Edition), one application server running the Index role (of which there can be only one per SSP) and optionally running other duplicated roles as well, "plus" a properly specified SQL cluster. This configuration can handle thousands of users, even with modest (by today's standards) hardware. In fact, from a performance standpoint it is probably still overkill for most organizations. Here, you might find some room to virtualize. But be careful - you split these roles out in the first place to avoid having a single point of failure. Virtualizing them and simply placing all of the VM's on a single host eliminates that benefit.
And What About SQL Server?
I decided to write about virtualization because I've been approached several times in the last week or so with folks asking about virtualizing the SQL Server side of SharePoint. Up until now, even when virtualization has been appropriate for some SharePoint components, I've always advised against making SQL virtual. But VMWare has just introduced a new version for which their marketing message is claiming that SQL virtualization is now a good thing.
Frankly, I'm not convinced. My main concern is that SharePoint is a very heavy user of SQL Server. Again it boils down to your utilization. Are your SQL Servers already CPU or disk I/O bound? If so, virtualization isn't going to help matters. If not, then you might be OK. Even if you decide to virtualize computation, I would still avoid virtualizing the data storage without first doing extensive load testing.
Going Forward
Ultimately, all computer configuration involves trade-offs. Cost, performance, and resilience are three corners of one of those "pick any two" engineering triangle conundrums. Virtualization doesn't eliminate the trade-off, but it can shift the balance toward the lower-cost corner. Whether or not it is appropriate in your case will depend upon your server load and tolerance for risk. In the case of SharePoint, you can achieve many of the same results as virtualization by simply re-consolidating the roles that were originally split off.
Mark Twain once said: "Keep all of your eggs in one basket - then watch that basket!" If you do decide to virtualize, make sure you take appropriate precautions. The Microsoft Consulting Services UK SharePoint Team has written an excellent series of articles on SharePoint virtualization. I suggest checking that out for more technical details.