With data center environments growing more complex, data center infrastructure management solutions that provide a single-pane view offer a way out for harried administrators. By Harshal Kallyanpur
However, innovations in technology and the adoption of advanced IT systems have brought along complexities that have made data center management a tedious process. Traditionally, data centers have been rather static in nature. As organizations became IT-enabled, they set up data centers with anywhere from a few to many physical servers. They followed a straightforward approach in managing these setups.
An IT team would manage the server sprawl while the facilities team took care of the power and cooling and building management. The two teams would interact with each other only when one team was tracing an issue to a component being managed by the other team or while making changes in the data center. These teams lacked real-time insight into each other’s activities.
Based on forecasts of their energy requirements, organizations would often overprovision power and cooling capacities and hope that these could meet demand for the next few years. They would use the OEM management software supplied by each data center equipment vendor to manage the respective data center component.
Impact of economic change
According to Dhir, as businesses grew and the economy underwent changes, the focus for organizations shifted to reducing costs, improving operational efficiency and increasing workforce efficiency. Organizations had to come up with intelligent approaches to not only setting up and managing IT, but also managing power and cooling and the overall data center infrastructure.
With economic change, space started coming at a premium and power costs began to soar. Organizations were finding it hard to find space to set up additional server capacity and, at the same time, provision power in the most cost-effective manner.
“However, in the last five to six years, power costs have risen across the globe and it has become a major bottleneck in data center planning and management. For our data centers that are located within the city of Mumbai, while the cost of space is a constraint, power costs today form a large chunk of our data center budget,” added Kulkarni.
Virtualization & blades
With the aim of saving on hardware, space and power costs, organizations turned to virtualization and looked at consolidating several physical servers onto virtual machines running on fewer physical servers. They also hoped that combining server resources into a single pool would make IT management easier.
According to Dhir, most Indian organizations at a mature stage of IT adoption have virtualized with the aim of reducing their server footprint. They have also adopted approaches such as hot-aisle/cold-aisle layouts to better manage cooling and energy consumption.
To further reduce space requirements, organizations adopted blade servers, which shrank the server to a smaller form factor so that many more servers could be accommodated in the same rack space. However, this only worsened data center management issues.
Kulkarni of Netmagic said, “A rack that would typically hold four to eight servers can take up to 32 servers today. While the space required for racks has reduced, the compute density per rack has risen. Due to this the power density per rack has also shot up. Not only does the power required per square foot go up but the increase in power also causes an increase in the heat density, which means that the servers’ cooling requirements go up.”
Also, a virtualized server environment is highly dynamic in nature, with frequent changes in workloads. As a result, compute requirements fluctuate and this, in turn, makes power requirements dynamic. It becomes difficult to manually correlate changes at the IT level with the required changes at the infrastructure level, and this calls for some degree of automation.
Traditional solutions fall short
Organizations initially chose to use the software-based OEM solutions provided by the data center equipment manufacturers to manage their respective data center components. As IT systems management solutions became the norm, the server infrastructure started getting managed better with solutions such as Tivoli from IBM and OpenView from HP.
On the building infrastructure and power and cooling side, organizations felt the need for a unified way to monitor and manage these components. It was then that they turned to building management systems or BMS. These software-based solutions interface with power and cooling components and even other infrastructure components such as physical security and access control systems to enable the integrated management of data center infrastructure.
However, data center management was still carried out in silos as there was little or no interaction between the facilities team that used the BMS and the IT team that used the IT systems management software. While the BMS solutions throw alerts and raise tickets every time that an exception occurs, there is no integration between the BMS and IT management solutions for a meaningful correlation and a defined course of action.
Harjyot Soni, CTO, Tulip Telecom, said, “In the traditional data center environment, server racks often had power capacities overprovisioned for them. As these servers were not running at their full utilization levels, power requirements for every server were not very high. Therefore, additional servers could easily be provisioned within the available server racks without any need for additional power.”
“However, today, with virtualization and blades, most servers are running at their optimum utilization levels and consuming a lot of power and it is impossible to simply add a new server to any rack. It is critical that you know what the power consumption, availability and threshold limits and the current heat load are on a per rack level.”
Chube of Emerson shared this view. He commented, “Let us assume that you have a server rack with a heat density requirement of 10 kWh and two UPS systems of 10 kWh each that cater to this requirement and ensure redundancy. If you stack additional blades such that heat density goes up to 15 kWh, you may still do so from an IT point of view thinking that you have 20 kWh at disposal to handle the increased heat density.”
“However, if one UPS fails, the UPS becomes the single point of failure for the servers. From a BMS point of view, you can supply 20 kWh. From an IT point of view you have 20 kWh for usage. However, there is no link between the two to correlate and tell you that your heat density requirement will go beyond your current level of redundancy.”
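The correlation Chube describes comes down to a simple check that neither system performs on its own. The sketch below is illustrative only: it assumes the quoted figures can be treated as kilowatts of load and UPS capacity, and the function name `survives_single_ups_failure` is hypothetical, not taken from any DCIM product.

```python
def survives_single_ups_failure(load_kw, ups_capacities_kw):
    """Return True if the load is still covered after the largest UPS fails.

    Assumption: the load and the UPS ratings are in the same unit (kW here).
    """
    if not ups_capacities_kw:
        return False
    # Worst case for redundancy: lose the biggest unit, keep the rest.
    remaining_kw = sum(ups_capacities_kw) - max(ups_capacities_kw)
    return load_kw <= remaining_kw


ups_units = [10.0, 10.0]  # two UPS systems of 10 each, as in the quote

# Original rack: one UPS can fail and the other still carries the load.
print(survives_single_ups_failure(10.0, ups_units))  # True

# After adding blades: total capacity (20) still exceeds the load (15),
# but the loss of either UPS can no longer be absorbed.
print(survives_single_ups_failure(15.0, ups_units))  # False
```

The second case is the one that slips through when the BMS and the IT management tools are not linked: each sees 20 units of capacity against 15 of load, and neither flags the loss of redundancy.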
Additionally, while a BMS takes information from various security systems in the data center, on an unauthorized physical access to an array of racks, it does not have the built-in intelligence to talk to the IT management system to restrict system access to servers within the racks.
DCIM to the rescue
According to Chube, Data Center Infrastructure Management (DCIM) solutions integrate BMS and IT management systems into a single pane and provide an array of information to help manage data centers more effectively. Solutions such as Trellis from Emerson take inputs from all components within the data center to provide a historic inventory of every component in the facility, along with its consumption levels and health. Such a solution also defines process flows to ensure that all changes, modifications and responses to issues in a data center are carried out in a defined manner, with all respective stakeholders informed and involved in the activity.
For instance, Trellis holds a library of products from grid to chip, such that it can, for example, give the complete specifications sheet of the server that IT wants to deploy. If an organization wants to deploy a particular number of servers of a particular type, the solution calculates the exact heat density for the rack that will hold the servers and shows how the servers can be placed in the rack.
Chube said that DCIM solutions also interfaced with the IT systems management solutions to understand server utilization levels. They monitor the heat load for every server in every rack along with the compute utilization in order to suggest shifting of workloads or moving all new workloads to available servers in a particular rack or across racks.
According to Soni of Tulip Telecom, a DCIM solution could also perform what-if analysis in terms of how changes in workloads would affect the total heat load in a server rack, availability of rack space to add new servers when required and how a server taken off the rack would affect the overall power and heat load on the rack and others.
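A what-if analysis of this kind can be pictured as projecting a rack's state after a hypothetical change, without touching the live inventory. The following is a minimal sketch under stated assumptions: the `Rack` class, the `what_if` function and all figures are invented for illustration, and heat load is approximated as equal to electrical draw, a common simplification rather than anything stated by the vendors quoted here.

```python
from dataclasses import dataclass, field


@dataclass
class Rack:
    name: str
    power_budget_kw: float  # assumed per-rack power and cooling limit
    slots: int              # number of server positions in the rack
    servers: dict = field(default_factory=dict)  # server name -> draw in kW

    def load_kw(self):
        return sum(self.servers.values())


def what_if(rack, add=None, remove=None):
    """Project the rack after hypothetical changes; the rack is not modified."""
    projected = dict(rack.servers)
    for name in remove or []:
        projected.pop(name, None)
    projected.update(add or {})
    load = sum(projected.values())
    return {
        "load_kw": load,
        "headroom_kw": rack.power_budget_kw - load,
        "free_slots": rack.slots - len(projected),
        "within_budget": load <= rack.power_budget_kw
        and len(projected) <= rack.slots,
    }


rack = Rack("rack-07", power_budget_kw=8.0, slots=4,
            servers={"app-1": 2.0, "app-2": 2.5, "db-1": 1.5})

# Would one more 2.5 kW server fit? It would push the rack over budget.
print(what_if(rack, add={"app-3": 2.5}))

# What if app-1 is taken off the rack first?
print(what_if(rack, add={"app-3": 2.5}, remove=["app-1"]))
```

Because the projection works on a copy, the same rack can be tested against several scenarios before any hardware is moved.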
The solutions give a measure of power consumption right from the power switches to the server. This helps during maintenance as the data center manager can simply traverse through the power tree to determine cause of failure or the exact server that needs to be brought down for maintenance.
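The idea of walking the power tree can be sketched with a small parent map. Everything below is hypothetical: the component names, the `PARENT` mapping and both helper functions are made up for illustration and assume a simple tree in which each component has exactly one upstream supplier.

```python
# Hypothetical power tree: each component maps to its upstream supplier.
PARENT = {
    "ups-1": "grid-feed",
    "pdu-a": "ups-1",
    "pdu-b": "ups-1",
    "server-01": "pdu-a",
    "server-02": "pdu-a",
    "server-03": "pdu-b",
}


def upstream_path(node, parent=PARENT):
    """Follow the supply chain from a component up to the root feed."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def downstream_servers(node, parent=PARENT):
    """List the servers that lose power if the given component goes down."""
    affected = []
    for child in parent:
        if child.startswith("server-") and node in upstream_path(child, parent)[1:]:
            affected.append(child)
    return sorted(affected)


# Tracing a failed server back to its possible causes:
print(upstream_path("server-03"))
# ['server-03', 'pdu-b', 'ups-1', 'grid-feed']

# Planning maintenance on a PDU: which servers must be brought down?
print(downstream_servers("pdu-a"))
# ['server-01', 'server-02']
```

A real facility would have redundant feeds, so a component could have more than one upstream supplier; the single-parent tree here is a deliberate simplification.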
With DCIM tools, applications and databases can be monitored and managed effectively along with the server and data center infrastructure. These tools use correlation engines to map the various components of the data center and build a component landscape that defines the relationships and interdependencies between them.
DCIM adoption and the way ahead
Many organizations in India are still in the process of deploying or using BMS and IT systems management solutions in a somewhat automated fashion for managing data centers. DCIM demands a considerable amount of investment and organizations are currently focusing on using their existing investments to manage IT. They are integrating alerting and ticketing processes with their BMS solutions in an attempt to get a somewhat holistic view of their data center for management.
DCIM is currently seen by consumers as a value add to the solution stack and not as a norm. Currently, most users are concerned about green initiatives and efficient energy utilization. The OEM solutions from power and cooling equipment providers are able to meet most of these requirements and, therefore, DCIM currently does not have strong demand from customers.
Kulkarni of Netmagic said, “DCIM solutions have seen a good amount of adoption in mature IT markets such as the US. In India, however, BMS itself has seen adoption only in the last five to ten years and organizations are in the phase wherein they have deployed or are deploying a BMS solution and are now looking at integrating these systems with a ticketing and alerting function.”
Furthermore, DCIM solutions are apt for managing large scale data center environments such as those of telecom service providers or data center services providers such as Netmagic and Tulip Telecom. Complexity is much higher in these data centers and, therefore, demands a highly automated mechanism for managing data center infrastructure.
Both Kulkarni of Netmagic and Soni of Tulip Telecom were of the view that data center services and managed services providers were primarily looking at adopting DCIM solutions to improve the management of their own data centers.
Furthermore, having such a solution running in the background helps them reinforce their commitment to providing customers with high quality, well managed data center services. Their customers, however, are not asking for such solutions yet, although the trend might change in a few years.
Soni said, “Customers using infrastructure-as-a-service have varying IT requirements and we bill them as per their changing compute requirements. For instance, in the typical billing model, if a customer wants to be billed for using four CPUs, we bill him for four even if he uses only two. However, we are moving to a model wherein if a customer uses only one CPU out of the four, we bill him only for using one. Power or energy costs are a part of the compute costs and we currently do not bill customers on power costs based on dynamic consumption.”
Kulkarni of Netmagic shared a similar opinion. He said, “Currently, customers choosing data center services are charged a flat rate for power costs. However, customers are asking for utilization-based billing for IT since they know that their workloads vary. In a few years, we will see customers asking for utilization-based billing even for power consumption as they also know that varying workloads also means varying energy requirements.”
He was quick to add that even from the service provider’s point of view, provisioning static power and cooling capacities for customers with dynamic IT requirements was not cost-effective as these capacities often stood underutilized and the service provider then lost money by paying for overprovisioned capacities. Hence, it would make sense to have a system that could help manage and scale power and cooling in accordance with changing IT requirements.
Foreseeing such a scenario, data center service providers are actively looking at DCIM solutions to help effectively manage dynamic data center environments. For instance, according to Soni, Tulip Telecom, whose data centers house close to 3.5 lakh servers spread across 12,000 racks, has evaluated and tested DCIM solutions from HP, IBM and BMC Software, and is close to deploying one of them.
Adding a user perspective to this, Dhir of Lanco Infratech said, “We have one of our data centers running on a managed services model where, besides IT, even power and cooling is managed and monitored by a leading service provider. The demands of our business are high and we wanted to consume IT in a hands-off manner and not get down to running and managing IT.”
Dhir added that going the managed services way for data center management made more sense as it was cost-effective and the service providers had all the processes and procedures in place. In some cases, policies and processes are better implemented by service providers as compared to in-house efforts.
In conclusion, organizations using DCIM would also need to revisit the team strategies for managing their data centers. With DCIM consolidating many of the data center management functions into a single hub, teams would need to be combined and data center managers would need to acquire additional skill sets. IT systems managers would need to be fairly well acquainted with BMS while facilities managers would be expected to have some level of IT systems management proficiency.
From an adoption point of view, large scale data center owners and service providers would lead the adoption curve while others would follow suit once the savings on data center costs are large enough for the solution to start paying for itself.
harshal.kallyanpur@expressindia.com