
The growing relevance of In-Memory Data Grids

July 12, 2011

The growing consumer affinity for cloud is spurring on various new technological trends. It’s not all new technology, mind you, but there seems to be a growing appetite for anything that can even remotely be put into the context of cloud computing. In some ways, cloud has been good for bringing previously existing technologies back to the forefront and driving needed innovation in those areas. Besides virtualization technologies, I believe that one of the best examples of this is in-memory data grid (IMDG) technology.

Despite what some may try to purport, in-memory data grids are not a by-product of the cloud computing revolution. The truth is they existed for a while before cloud, but to be fair, IMDGs probably owe a tip of the hat to the cloud computing craze for bringing them back into the spotlight. Increasingly, we are seeing workloads that are highly scalable, temporal, and elastic making their way into the cloud. These application characteristics often align nicely with the use of IMDGs, so we see renewed interest and quite a bit of innovative activity around these solutions.

I have been spending a lot of time lately working with users that are in the process of defining an evolution to their current enterprise middleware environments. In this work, cloud and IMDG technology has been front and center. My last few posts have been dedicated to talking about some of the cloud trends I have seen during the course of this work, but today I want to focus on IMDG solutions. Specifically, I want to share with you what I’m seeing in terms of how users are currently looking to use this technology, and provide my own thoughts about possible usage scenarios going forward. Let’s start with the common, currently targeted usage scenarios:

1) IMDG as a database buffer: This is perhaps the single most common use case. Here users look to front traditional data stores with a distributed IMDG. This can serve to increase the performance of the application and thereby improve end-user experience by offering faster data access. It can also help to reduce the pressure on the backend database by making the IMDG instance the system of record and periodically batching changes to send off to the database. There are many different techniques for buffering an existing database with an IMDG, but the motivations (increase performance, decrease database reliance, decrease costs) are usually much the same.

2) IMDG as a simple cache: You may hear this referred to as the side cache scenario as well. This usage pattern is a little less intrusive to existing application architecture than #1 above. Here, applications receive an incoming request that requires data, and they first check in the simple cache instance to see if the data exists there. If not, the application proceeds to retrieve the data normally, typically from a relational database system, and then inserts this data into the simple cache. Obviously, when the application finds data in the simple cache, you reduce the path of the application and thus decrease overall response time. This is an especially prevalent pattern for storing conversational state (think HTTP sessions) for applications.

3) IMDG as a service request cache: This is really a variant of #2 above, but I call it out separately because users typically implement it at a different tier in the application architecture. Instead of updating the application to be aware of some IMDG for a simple cache, users insert this awareness further up the stream, often at the ESB tier. Requests come into the application environment, but before they hit the application, some component dissects the request to determine if it can fulfill it from the cache. If so, the entire path becomes significantly shorter, and if not, the mediation component inserts the response in the simple cache on the way back out. Again, it is not very different from #2, but in my opinion it is worth distinguishing as it occurs at a completely different tier in the overall architecture.
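To make patterns #1 and #2 above a little more concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: a plain dictionary stands in for the distributed grid, and the class names (`SideCache`, `WriteBehindBuffer`), the `load_from_db` callback, and the `batch_size` parameter are all hypothetical rather than taken from any particular IMDG product.

```python
from typing import Callable, Dict, Optional


class SideCache:
    """Pattern #2 (side cache): check the cache first, then fall back to the database."""

    def __init__(self, load_from_db: Callable[[str], str]) -> None:
        self._grid: Dict[str, str] = {}      # stand-in for a distributed IMDG
        self._load_from_db = load_from_db    # hypothetical "normal" retrieval path
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        if key in self._grid:
            self.hits += 1
            return self._grid[key]           # short path: served from the cache
        self.misses += 1
        value = self._load_from_db(key)      # long path: go to the data store
        self._grid[key] = value              # insert so the next request is a hit
        return value


class WriteBehindBuffer:
    """Pattern #1 (database buffer): the grid takes writes and batches them to the database."""

    def __init__(self, database: Dict[str, str], batch_size: int = 3) -> None:
        self._grid: Dict[str, str] = {}      # grid copy, treated as the system of record
        self._dirty: Dict[str, str] = {}     # changes not yet sent to the database
        self._database = database            # stand-in for the backend database
        self._batch_size = batch_size
        self.flushes = 0

    def put(self, key: str, value: str) -> None:
        self._grid[key] = value
        self._dirty[key] = value
        if len(self._dirty) >= self._batch_size:
            self.flush()

    def get(self, key: str) -> Optional[str]:
        return self._grid.get(key)           # reads never touch the database here

    def flush(self) -> None:
        if not self._dirty:
            return
        self._database.update(self._dirty)   # one batched write instead of many small ones
        self._dirty.clear()
        self.flushes += 1
```

A real grid would also have to deal with eviction, expiry, replication, and what happens if a node fails before a flush, none of which this sketch attempts. Pattern #3 is essentially `SideCache.get` moved out of the application and into a mediation component, keyed on the request rather than on the data.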

Those are three common patterns that I run into quite frequently today with IMDG technology. There are a couple more usage patterns that I have seen once or twice but that are not yet prevalent. ‘Yet’ is the key word here, as I tend to think we will be seeing more of these use cases in the near future:

1) IMDG as an event filter: To be clear here, I am not suggesting that IMDG instances would morph into event processors. There are completely separate solutions that do that very well. What I am talking about is using the distributed logic processing capabilities shipped with most IMDG solutions to quickly determine if a given event needs further processing. In this way, events that occur in the enterprise can flow through an easily scalable IMDG filter and then, only if necessary, be sent to an event processor for more expensive computation. In the increasingly event-driven architectures that are emerging today, I feel this could become a popular pattern.

2) IMDG as a map/reduce engine: Many IMDG solutions deliver the capability to distribute logic, as mentioned in #1 above, and many offer a map/reduce model as a means to do this. As skills on map/reduce programming start to permeate enterprise IT shops, I think users will see compelling use cases for IMDG built primarily on this methodology. The ability to quickly distribute logic, calculate results, and further refine those results all out in the grid is a powerful tool to have in your arsenal. It can in fact be quite liberating to leverage the processing power of a distributed grid to solve important yet complex business problems.
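The two ideas above, filtering events where the data lives and running map/reduce across the grid, can be sketched roughly as follows. This is a toy, single-process illustration: `ToyGrid`, its partitioning scheme, and the helper functions `count_types` and `merge_counts` are hypothetical names of my own, and a real IMDG would ship the logic to remote partitions rather than loop over local lists.

```python
from collections import Counter
from functools import reduce
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]


class ToyGrid:
    """A single-process stand-in for a partitioned data grid (an assumption, not a real IMDG)."""

    def __init__(self, partition_count: int = 4) -> None:
        self._partitions: List[List[Event]] = [[] for _ in range(partition_count)]

    def put(self, key: str, event: Event) -> None:
        # Deterministic toy partitioning: sum of the key's bytes modulo the partition count.
        index = sum(key.encode()) % len(self._partitions)
        self._partitions[index].append(event)

    def filter(self, predicate: Callable[[Event], bool]) -> List[Event]:
        """Event filter: evaluate the predicate per partition, return only the matches."""
        return [e for partition in self._partitions for e in partition if predicate(e)]

    def map_reduce(self, mapper: Callable[[List[Event]], Any],
                   reducer: Callable[[Any, Any], Any], initial: Any) -> Any:
        """Map: one partial result per partition. Reduce: merge the partial results."""
        partials = [mapper(partition) for partition in self._partitions]
        return reduce(reducer, partials, initial)


def count_types(partition: List[Event]) -> Counter:
    """Mapper: count event types within one partition."""
    return Counter(e["type"] for e in partition)


def merge_counts(left: Counter, right: Counter) -> Counter:
    """Reducer: combine two partial counts."""
    return left + right
```

With a grid populated via `put`, a call such as `grid.filter(lambda e: e["amount"] > 100)` would pick out only the events worth forwarding to an event processor, and `grid.map_reduce(count_types, merge_counts, Counter())` would total event types across all partitions.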

Now, I have no idea if the two use cases above will in fact pan out as mainstream in the near future. I see massive potential there because of their alignment with emerging architectures and their ability to deliver real business value. However, I can also imagine IMDG technology taking off in entirely different directions. Whatever the near future holds though, I think one thing is certain: We are just beginning to explore the art of the possible when it comes to IMDG technology.

Catching up with the Cast Iron team

September 9, 2010

While many enterprises are expanding their use of off-premise, cloud-based applications and application platforms, it seems some gloss over an important point. These same companies have, and will continue to develop, on-premise applications in both cloud and traditional environments. Considering the ongoing activity in both off-premise and on-premise, it is only a matter of time before a given company has the need to integrate data between their on-premise and off-premise applications.

The ability to do this data integration, thereby enabling hybrid clouds or hybrid enterprise architectures, is exactly where Cast Iron excels. That is why from an employee perspective, particularly one that works with cloud computing, it was exciting to see IBM’s acquisition of Cast Iron earlier this year. Recently, I had a chance to catch up with Chandar Pattabhiram (Cast Iron’s Vice President of Product & Channel Marketing) to ask him a few questions about an upcoming Cast Iron podcast.

Me: It seems that Cast Iron Systems is about enabling hybrid architectures. It enables users to connect off-premise cloud applications with on-premise cloud or traditional applications. Now for the obvious question: What are the drivers for the need to connect on-premise and off-premise applications?

Chandar: Companies are rapidly adopting cloud applications; the industry is projected to exceed $20B in the next few years. However, the same customers who are adopting the cloud have already invested millions of dollars in on-premise applications like SAP, Oracle or even homegrown solutions. As a Fortune 500 customer recently told me, “just because I like Salesforce.com doesn’t mean that I’m going to get rid of SAP for my financials.” Therefore, we have a hybrid world of off and on-premise applications. More and more cloud customers are realizing that using Cast Iron Systems to integrate this hybrid world is the sure-fire way to maximize the economic value of their cloud investment. Why? With Cast Iron integration, cloud users no longer have to do the “swivel chair” approach of accessing multiple systems; they now have real-time visibility of data previously locked away in other enterprise applications.

Solving this problem is not as simple as it may seem. According to recent studies by Gartner, Forrester and Saugatuck, CIOs rate application integration along with security as the top reasons why they shy away from the cloud. Cast Iron Systems has enabled hundreds of customers to solve this problem by offering a complete cloud integration platform. The value of Cast Iron to these customers has been twofold – they have maximized their productivity and increased their adoption. The result? Cloud providers increase their “stickiness” and customer loyalty. Simply put, Cast Iron has become the customer loyalty application for the cloud.

Me: Cast Iron Systems lists its core capabilities as connectivity, transformation, logic, and management. Can you elaborate a little on each of those?

Chandar: The Cast Iron solution has been built from the ground up to provide exactly what you need for cloud integration. The key is simplicity – provide only what users need rather than all the extra bells and whistles that no one ends up using anyway. As you say, the four features that are a must-have for cloud integration are connectivity, transformation, workflow and management.

First off, the Cast Iron solution provides connectivity to hundreds of cloud applications, enterprise applications, web services, databases, flat-files, etc.

Beyond connectivity, the Cast Iron solution enables you to graphically map data between source and target applications. For example, if a data field called customer number is alphanumeric in your on-premise ERP and corresponds to a numeric field called account number in your off-premise CRM, you can graphically transform these so both applications interpret this as the same information.

The Cast Iron solution also enables you to graphically define the flow of data between source and target applications. For example, you can graphically define the required steps to extract customer data from your on-premise ERP system and send it to different cloud applications.

Wrapping up the core capabilities, the Cast Iron solution provides you with one cloud-based console to manage your integration. This enables complete visibility to data flowing across both your on and off-premise environments. Think of this as similar to the ability to track packages you ship with companies like FedEx or DHL.

Me: Can you tell us what Cast Iron Preconfigured Templates are and what they mean to the end user?

Chandar: With thousands of successful customer integrations, we leverage a wealth of integration experience to provide a comprehensive set of TIPs. These TIPs are offered for the most common integration scenarios between a number of enterprise applications like Salesforce.com, SAP, Oracle, etc. and eliminate the need to build your integrations from scratch. You can simply log in via your browser, select the template that best suits your requirements and enjoy proven, supported and certified processes. You can further customize these TIPs to meet your specific needs using a simple configuration wizard. I call this the “Turbo Tax” approach to integration. For those of us brave enough to do our taxes, we don’t start off with a 1040 form. Instead, many of us use Turbo Tax wizards to answer the right questions and what we get is a customized tax form for us; this is the same experience the Cast Iron configuration wizard provides. What this means to a customer is that they are able to accelerate their time to production and be live in literally days rather than weeks or months.

Me: Tell us a little about your favorite customer success story with Cast Iron.

Chandar: There are many stories to tell. Let me give you a couple of success stories – one in a Fortune 500 company and one in what we call the general business sector.

A Fortune 500 pharmaceutical product distributor replaced various traditional systems with Salesforce.com as the CRM application for their call center service representatives (CSRs). After doing so, the challenge was then to empower all of their CSRs with real-time information in Salesforce.com, thus enabling them to deliver a superior customer experience. Historically, the CSRs spent hours collecting this information by accessing multiple applications, which resulted in a significant loss of sales productivity. The IT team deployed Cast Iron to connect their SQL-based homegrown data warehouse with Salesforce.com in real time. This solution created a 360-degree view of customers in real time. The customer implemented the entire integration solution in just ten days. The Cast Iron solution saves the company $250K annually and IT staff previously dedicated to cloud integration can now focus on strategically oriented, innovative projects that can lead to new revenue streams for the company.

A $2B manufacturer of consumer devices has a wide range of cloud and on-premise applications including solutions from SAP, JD Edwards and various others. They chose Salesforce.com as their CRM platform with the goal of delivering a superior customer experience. They wanted to use Salesforce.com as the single application that provided a seamless, 360-degree view of their customers and maximized the productivity of their sales and technical service teams. With Cast Iron, they performed integration between Salesforce.com and the on-premise systems including SAP, JD Edwards and flat files. Now the technical service teams no longer have to log in and manually access the information in back-office ERP systems. Again, the first SAP to Salesforce.com integration project took only 10 days to complete. The company benefits from approximately $210K in savings each year by eliminating ERP licenses, improving productivity and minimizing integration implementation costs.

Me: Thank you Chandar!

To hear more about Cast Iron and other cloud solutions in IBM WebSphere, check out the ongoing Enabling cloud computing with WebSphere program.

How will data grid centralization affect usage patterns?

September 2, 2010

Maybe I’m just a geek, but to me, in our ever growing, massively scaled enterprise computing landscape, there are few technologies that pique my interest like memory-based data grids. It is nothing short of amazing to see an increasing number of enterprises use these solutions in a myriad of ways, all to solve an old dilemma: How can one efficiently and cost-effectively scale data while preserving quality of service characteristics of the data such as performance, availability, consistency, and manageability? In my view, the use cases emerging from these solutions are among the most creative and intriguing that we see in the enterprise computing world today.

At a recent conference, I was chatting with two colleagues that work with customers interested in and implementing some of our IBM data grid technologies. I took away a couple of things from that chat. The first is that the adoption of data grid technologies is definitely on the upward swing. Enterprises are investigating various data grid solutions, and they are putting them to use in continually growing numbers. Some of this growing adoption probably owes to the increasing importance of elastic data tiers in cloud computing architecture. In other cases, enterprises have simply come to a tipping point where the current way they interact with their data is inefficient and costly.

The second takeaway from my chat was not as much about adoption as usage. Both of my colleagues, who I should say work with different users in different industries and geographies, agreed on a common emerging usage scenario. More and more enterprises are reducing their overall number of grids, and they are creating grids with a variety of data served up to numerous applications. From a management standpoint, this certainly makes sense. While effectively partitioning such diverse sets of data may be a bit more of a burden, the administrative efficiency gained by managing fewer grid instantiations probably makes it worthwhile (especially when you consider that many data grid technologies provide quite a bit of partitioning know-how).

In some ways, this natural evolution to fewer, more highly heterogeneous grids seems inevitable. In that sense, it is interesting to think about what this emerging usage pattern may mean to the evolution of data grid use within the enterprise. Specifically, what will enterprises start to do differently in adopting this type of data grid usage?

Now, I’m no data grid expert, and I did not stay at a Holiday Inn Express last night, but that won’t keep me from giving my opinion on this matter. As I see it, the move to more centralized data grids will dictate the need for enterprises to more tightly align data grid and enterprise service bus technologies.

As data grid centralization increases, the number of different applications accessing a single grid also increases. It is conceivable, and in fact likely, that applications written in different languages and running on different platforms will need to access and perhaps modify the same data within the grid. The data as it is stored in the grid may not be in an ideal format for all of the applications. Certainly, the applications could retrieve the data in some sort of least common denominator form (e.g., XML) and do the appropriate transformations. However, that leaves you with application code that does data transformation work before it actually works on the data. Clearly, this is not ideal. Instead, enterprises may benefit from using an ESB to transform the data returned from the grid to a format more desirable to the requesting application. This moves transformation code out of the application, where I think we all agree it does not belong, to the ESB.
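As a rough illustration of that idea, the sketch below keeps data in the grid as XML and lets a mediation layer hand each application the format it asks for. Everything here is an assumption made for the example: the `GRID` dictionary stands in for a data grid, `mediate` stands in for an ESB mediation, and the sample record and format names are invented.

```python
import json
import xml.etree.ElementTree as ET
from typing import Callable, Dict

# Stand-in for a data grid that stores records in a least common denominator form (XML).
GRID: Dict[str, str] = {
    "customer:42": "<customer><id>42</id><name>Ada</name></customer>",
}


def xml_to_dict(xml_text: str) -> Dict[str, str]:
    """Flatten a one-level XML record into a tag -> text mapping."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "") for child in root}


def to_json(xml_text: str) -> str:
    return json.dumps(xml_to_dict(xml_text), sort_keys=True)


def to_csv_line(xml_text: str) -> str:
    fields = xml_to_dict(xml_text)
    return ",".join(fields[tag] for tag in sorted(fields))  # columns ordered by tag name


# The transformations live in the mediation layer, not in the applications.
TRANSFORMS: Dict[str, Callable[[str], str]] = {
    "xml": lambda xml_text: xml_text,  # pass-through for applications that want XML
    "json": to_json,
    "csv": to_csv_line,
}


def mediate(key: str, wanted_format: str) -> str:
    """Fetch a record from the grid and return it in the caller's preferred format."""
    raw = GRID[key]
    return TRANSFORMS[wanted_format](raw)
```

An application that prefers JSON would call something like `mediate("customer:42", "json")` and never see the XML, which is the point: the transformation code sits in one place instead of being repeated in every consumer.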

In addition to the data transformation capabilities ESB solutions afford, they also typically provide protocol transformation capabilities. This may come in handy as the number of applications, and presumably protocol access methods, accessing a single grid increases. Instead of requiring a data grid to accommodate an untold number of access protocols, the ESB could intercept the various types of requests and then forward each one to the data grid using a supported protocol. While data grid support for commonly used protocols like HTTP may mitigate this need to some degree, I still believe this will afford enterprises and their applications a nice degree of flexibility.

Now, I am not putting my head in the sand about some of the effects of using data grid and ESB technologies in tandem. Chief among the concerns would be the fact that an ESB between applications and the data grid would certainly introduce some degree of latency. In many cases, users leverage grids for extremely fast access to data, so this pattern may not be ideal. However, in other cases the value provided by the ESB/data grid combination may outweigh any latency concerns. So, what do you think? Am I way off base here, or is there value in architectural patterns that combine ESB and memory-based data grid solutions?

