Autonomic cloud management approaches

The platform services segment of cloud is multi-faceted… to say the least. Lately, likely spurred on by announcements like IBM Workload Deployer and VMware Cloud Foundry, I have been thinking quite a bit about one of those facets: environment management. To be clear, I’m not talking about management tools for end-users, though that topic is worthy of many discussions. Rather, I’m talking about the autonomic management capabilities for deployed environments.

Put simply, I define autonomic management capabilities as anything that happens without the user having to explicitly tell the system to do it. The user may define policies or specify directives that shape the system’s behavior, but when it comes time to actually take action, it happens as if it were magic. Cloud users, specifically platform services users, are steadily coming to expect a certain set of management actions, such as elasticity and health management, to be autonomic. Increasingly, we see platform service providers responding to these expectations to create more intelligent, self-aware platform management capabilities for the cloud.

Now, it may be tempting to say that users would not need to know much about the way autonomic management techniques work. Beyond knowing what capabilities their platform provider offers and how to take advantage of those, the end-user can be blissfully unaware. That’s the point, right? I agree up to a point. The user probably does not need to know much about the algorithms and inner workings that carry out the autonomic actions. However, I do think the user should be aware of the basic architectural approach used to deliver this kind of functionality. After all, the architectural approach has the potential to impact costs (in terms of resources used), and it certainly will impact the way you begin debugging system failures.

When it comes to architectural approaches for providing these self-aware management capabilities, it seems to me that a few different philosophies prevail in the current state of the art. First off, there is the isolated management approach. In this case, there are separate processes, actually they are usually even separate virtual containers, that manage one or many deployed environments. The main benefit of this approach is that the containers running the application environment do not compete for resources with the processes managing that environment. The management processes are completely separate. They observe from afar and take action as necessary. Of course, there are drawbacks to this approach as well. Chief among them is the fact that the management and workload components scale separately. Presumably, as the workload components scale up the management components will have to scale up as well (surely not at a 1:1 ratio, but there will be an upper bound to what a single management process can manage). Additionally, one has to manage availability of both the management and workload components separately. All of these factors can increase resource usage and management overhead.

Another architectural approach in this arena is the self-sustaining approach. Here, the system embeds management capabilities and processes into each deployed environment. This removes the need for an external observer, means that management capabilities scale with your deployments, and eliminates the need to manage availability for separate components. These facts can contribute to reducing the overall management overhead. The main drawback in this case is the fact that the management processes can potentially compete with your application processes for resources. If the platform service solution you are using takes this approach to delivering management functionality, my advice is simple: do not assume anything. That means don’t assume that management processes will adversely affect the performance of your application workload and don’t assume they will not. Test, test, test, test!

As always, there is a middle ground, a hybrid approach if you will. In this case, every deployment creates a running application environment with some amount of embedded management capability. The embedded management components can take some actions on their own, but rely on an external component for other actions or raw information. This is actually quite popular in virtual environments since it can be tough for processes running in a guest virtual machine to get details about overall resource usage on the underlying physical host. Instead, processes running in the virtual machine call to an external component that has visibility of resource usage at the physical level. This approach affords a compromise between the higher management overhead of the first approach and the dueling processes of the second approach. That said, it comes with a certain amount of drawbacks from both approaches.

I am not necessarily advocating for one architectural approach over the other. I certainly think some are better approaches for certain scenarios, but I do not see a silver bullet answer here. I simply think that users should be aware of what their particular choice of a solution does in this respect and plan (and test!) accordingly.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: