Big Data: Inside in-memory analytics

With the cost of accessing data in memory no longer prohibitive, analytics solutions now offer companies affordable, real-time insights into business operations.

By Jasmine Desai

When Godrej Consumer Products wanted to implement an analytics solution, it had only one focus in mind: overall information visibility, especially given that the amount of data a business has to deal with is growing exponentially. Says Subrata Dey, CIO, Godrej Consumer Products, “The challenge is how to use the data to get value out of it. Our focus was on user interface, robustness of the solution and scalability. Also, there should be no deployment hassles and it should be quick to deploy.”
The organization zeroed in on QlikView’s Business Discovery platform, built on an associative in-memory architecture, and went live with the solution within 2-3 months of selection. Currently, it is being used for sales analytics and has around 500 users, but in the future it will be extended to finance, marketing and supply chain management as well. “When organizations go for mass implementation, it is better to do it in a structured manner and eventually grow from there. The solution is very intuitive in nature, so no particular training was required and it has a lot of ad-hoc capabilities. For us, it has been a win-win situation,” he says.
In-memory analytics has been the most talked about, and the most misunderstood, architecture this year. Analytics is the new building block of business decision making in a majority of progressive organizations. A recent study on data-driven decision making conducted by researchers at MIT and Wharton provides empirical evidence that “firms that adopt data-driven decision making have output and productivity that is 5-6% higher than the competition”. Naturally then, business analytics is a top priority for CIOs and finance executives. However, traditional analytics has limitations that get exposed in the face of evolving trends like big data, mobile applications and cloud computing. CIOs today are looking for systems and tools that sift through the deluge of big data to provide fast, interactive, insightful, real-time analytics.

As per a Forrester report, The Future of Customer Data Management, until recently putting data in memory was not an option because it was prohibitively expensive compared with disk space. To support granular, personalized customer experiences, in-memory data is critical to delivering faster predictive modeling, enabling real-time data access, processing big data quickly, and offering new customer insights and opportunities that used to be impossible to get. Customer data stored and processed in memory helps create opportunities to up-sell and cross-sell new products to a customer based on their likes, dislikes, circle of friends, buying patterns and past orders. Key technologies that can help deliver faster insights and real-time customer experiences include in-memory platforms and event processing platforms.

In-memory analytics is a way of executing analytical procedures. Traditional analytical engines did a lot of heavy lifting, moving data from the “storage” layer into the “processing” region and then passing the results back into the storage layer for presentation by the applications. This placed heavy demands on the processors for low-criticality data movement most of the time, leaving less bandwidth for the real work, which was analytics. “That whole game has been changed by the advent of in-memory analytics. Here, the processing and the data are brought closer to each other. With the development in processor speeds and also of the data I/O of RAM, the data processing is done in the RAM and not in the storage layers (hard disks). The result sets are also stored in the RAM and hence visualization and presentation is nimbler,” says Dinesh Jain, Country Manager, Teradata India Pvt Ltd.
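The idea can be sketched with plain Python and SQLite, which stand in here for a generic engine rather than any vendor's product; the table, row count and timings below are illustrative assumptions. The same load-and-aggregate step is run once against a disk-backed database and once against an in-memory one, so the round trip through the storage layer disappears in the second case.

```python
import os
import sqlite3
import time

def load_and_aggregate(conn):
    """Load a small sales table and run one aggregation, timing the whole step."""
    start = time.perf_counter()
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("north" if i % 2 else "south", i * 0.5) for i in range(200_000)],
    )
    conn.commit()
    result = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region").fetchall()
    return result, time.perf_counter() - start

# Disk-backed database: inserts, commits and reads all go through the storage layer.
if os.path.exists("sales_demo.db"):
    os.remove("sales_demo.db")
disk_conn = sqlite3.connect("sales_demo.db")
_, disk_time = load_and_aggregate(disk_conn)
disk_conn.close()

# In-memory database: the working data and the result set both live in RAM.
mem_conn = sqlite3.connect(":memory:")
_, mem_time = load_and_aggregate(mem_conn)
mem_conn.close()

print(f"disk: {disk_time:.3f}s  in-memory: {mem_time:.3f}s")
```

The absolute numbers will vary by machine, but the pattern is the point: once the data sits in RAM, the processor spends its time on the analytics rather than on shuttling data to and from disk.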

As per Vikash Mehrotra, Sales Consulting Director – EPM/BI, Oracle India, “Speed is of the essence here and this is where in-memory analytics comes into play. In contrast to traditional disk-based processing, in-memory processing makes it possible to do complex analytical computations on large sets of data in a minimal period of time.” This approach makes interacting with and querying data blazingly fast and makes real-time analytics possible. It also reduces or eliminates the need for data indexing and for storing pre-aggregated data in OLAP cubes or aggregate tables.
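A small, hedged illustration of that last point, using pandas as a stand-in for an in-memory engine (the column names and data are invented): an ad-hoc roll-up is computed directly over a million detail rows held in RAM, with no cube or aggregate table prepared in advance.

```python
import numpy as np
import pandas as pd

# One million invented detail rows held entirely in memory.
rng = np.random.default_rng(0)
n = 1_000_000
detail = pd.DataFrame({
    "region":  rng.choice(["north", "south", "east", "west"], size=n),
    "product": rng.choice(["soap", "shampoo", "cream"], size=n),
    "amount":  rng.uniform(10, 500, size=n),
})

# No OLAP cube, no pre-aggregated table: whatever roll-up the user asks for
# is computed on the fly against the raw rows.
by_region_product = detail.groupby(["region", "product"])["amount"].sum()
print(by_region_product.head())
```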

Mythical challenges?
In-memory analytics is definitely surrounded by challenges, but the real ones often seem to be ignored. A dearth of skill sets is frequently highlighted as a challenge, but is it real? As per Jain of Teradata, “In-memory analytics is just a manipulation mechanism to use your hardware and infrastructure resources in a more efficient way, and any in-memory optimized analytical engine is capable of doing that automatically. No new skill set is required. It may just be more of a budgeting, hardware sizing and data governance decision.”

According to Akhilesh Tuteja, Partner, KPMG, “Many times when organizations look for talent pool they only look for people who can configure systems and those with technical skills. One needs to have functional skills as well.”

Says Rajesh Shewani, Technical Sales Lead, IBM, “Lack of skills is a myth, because you do not need to train anyone on in-memory analytics, but on the specific solution. The user needs to be trained on the analytics platform.”

A very important component of analytics is data visualization. Says Savita Kirpalani, Chief Analytics Officer, Rewire, “Indian organizations are not savvy about visualizations. There are so many charts, but there is no skill to interpret them. How many customers can you have with such data? There has to be an appeal in the way data is presented and what you see: only then will you want to go further. Vendors need to have the ability to sell with the data that is available.” Another recurring problem is unclean data. Data needs to be synchronized properly, especially because it is often stored on different systems in different locations, so the challenge is how to bring it together to make an implementation successful.

From the investment standpoint, organizations tend to make large investments at the outset and then provide the solution only to a few back-end analysts. Also, not enough business value is attached to the implementation. As per Sanchit Vir Gogia, Chief Analyst, Founder & CEO, Greyhound Research, “CIOs should look at building a COE (Center of Excellence). This can bring IT and users together. They can redefine the testing pattern.”

According to Manish Sharma, Head of Database and Technology, SAP India, “The most critical factor is the trust in data. The reason you are taking data out is because you do not want to load the application with too many users. It is not only about speed but also what value it gives to the business process. The concept of data is a process issue and not so much to do with technology. For many customers it is a starting point to get clean data.”

Making way for in-memory
One should understand that in-memory is not a magic bullet. There are limits to how much data you can process in RAM, and hence a hybrid approach based on hot data (frequently used) and cold data (infrequently used) is the most appropriate for a large enterprise. Organizations should therefore have clear policies for tagging data and a clear focus on what in-memory analytics is being used for.
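What such a tagging policy might look like can be hinted at with a toy sketch; the tier names, the access-count threshold and the promotion rule below are assumptions made for illustration, not how any particular platform implements it.

```python
# A simplified hot/cold tagging sketch, not any vendor's implementation.
HOT_THRESHOLD = 5   # accesses after which a key is considered "hot"

access_counts = {}
hot_tier = {}       # stands in for data pinned in RAM
cold_tier = {}      # stands in for data left on disk or in the warehouse

def read(key):
    """Serve a read, promoting frequently accessed data into the hot (RAM) tier."""
    access_counts[key] = access_counts.get(key, 0) + 1
    if key in hot_tier:                          # fast path: already in memory
        return hot_tier[key]
    value = cold_tier[key]                       # slow path: fetch from cold storage
    if access_counts[key] >= HOT_THRESHOLD:      # tag as hot and keep it in RAM
        hot_tier[key] = value
    return value

# Usage: seed cold storage, then repeated reads promote the key into the hot tier.
cold_tier["daily_sales_2013_11"] = [1200.0, 950.5, 1810.25]
for _ in range(6):
    read("daily_sales_2013_11")
print("hot keys:", list(hot_tier))
```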

Selecting an in-memory analytics solution is not an easy task. To begin with, in-memory architecture is divided into different segments, with different vendors offering solutions in each: in-memory OLAP, in-memory ROLAP, in-memory associative index, in-memory inverted index and in-memory spreadsheet. Says Gogia, “Some are better at handling queries, some need less physical memory. Depending on the maturity of data, one should select the solution. One has to see the load time, the access to third-party tools, the interface and memory optimization. Also, what is the vendor’s approach to fit into that allotted space of memory?” Thus, organizations have to look at product capability and how to make it work over time: how to improve it, how to maintain it and how to configure it. Buying a certain solution because of an old affiliation with a vendor is the wrong way to go about it and can cause a huge implementation failure.

In terms of product capability, the features required differ from organization to organization. Large organizations with a long-term play will have many needs, but smaller organizations tend to go for small solutions that are easy to implement. Says Tuteja of KPMG, “Organizations have to see how it fits into transactional systems like ERP and CRM, and how good its visualization capability is. Unlike a transactional system, which you have to use because it is integral to the organization, with BI one can tend to not use it. Thus, ease of use is critical.” One can start on a small scale and grow big: an entry point could be as low as Rs. 40 lakh for a small tool with end-to-end implementation.

Organizations get full value when they design applications to leverage the full capability of the in-memory architecture. They should start to push calculations into the analytics layer itself, because that allows the solution to process data much faster. Sudipta K. Sen, Regional Director – South East Asia, and CEO & MD – SAS Institute India Pvt. Ltd. says, “A crucial parameter for any organization, while deciding to select a particular in-memory solution, is the ability to perform exploratory analysis on not just a sample set or subset but the entire data. Another vital aspect is ‘data visualization.’ It is also pertinent that enterprises avoid data quality and integrity issues. They must manage the relationship between in-memory analytics tools and data warehouses by implementing systems and processes to ensure clean data so that queries yield quality information.”
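A minimal sketch of what pushing calculation into the analytics layer means in practice, again using SQLite's in-memory mode as a stand-in engine with invented table and column names: the aggregation runs where the data already sits, so only a small result set travels back to the application.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
con.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(f"c{i % 1000}", float(i % 97)) for i in range(100_000)],
)

# Anti-pattern: pull every detail row into the application and aggregate there.
rows = con.execute("SELECT customer, amount FROM orders").fetchall()
totals_in_app = {}
for customer, amount in rows:
    totals_in_app[customer] = totals_in_app.get(customer, 0.0) + amount

# Pushed down: the engine aggregates where the data sits and returns ~1,000 rows
# instead of 100,000.
totals_pushed = dict(con.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer").fetchall())

assert totals_in_app == totals_pushed
```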

As per Mehrotra of Oracle India, “Organizations should provide for high performance and scalable analytical sandboxes. When a problem occurs, humans can solve it through a process of exclusion. And often, when we do not know what we are looking for, enterprise IT needs to support this ‘lack of direction’ or ‘lack of clear requirement’. We need to be able to provide a flexible environment for our end users to explore and find answers.”

The Indian market
India’s digital universe is expected to grow by 50% every year through 2020. The digital bits captured or created each year are expected to grow from 127 exabytes to 2.9 zettabytes during this period. However, less than half a percent of this data is analyzed today, whereas 36% of it could provide valuable insights. This makes big data analytics and BI solutions like in-memory analytics a significant opportunity for India.

As per Jain of Teradata, “There are a lot of companies which are coming up with visualization tools focused on ‘self-discovery’ based analytics. In India, most corporations have spent heavily on traditional analytics and have stabilized it over the years. Now is a good time for them to start exploring this new mechanism of gaining maximum business value from their frequently accessed data.” However, the underlying infrastructure needed to support it becomes critical, and an infrastructure audit before embarking on an in-memory journey is advisable.

As per Sen of SAS, “In-memory analytics uncovers lucrative opportunities not only from an enterprise standpoint but also from a state and central government perspective.” In-memory analytics is especially gaining traction because it offers the opportunity to do super-fast, real-time analytics at an affordable cost. “The growth in 64-bit operating systems and the declining cost of RAM make it possible for even small-sized companies in the country to deploy in-memory analytics solutions,” says Mehrotra of Oracle India.

jasmine.desai@expressindia.com
