We live in a society in which shortages of energy, global warming and the effects of greenhouse gases have made environmental care a priority for governments, companies and society as a whole. Information Technology (IT) cannot remain indifferent to this challenge, and all the agents involved in its development, implementation or use have begun to consider every possible effort to mitigate the impact of IT on the environment. Today, energy consumption is a critical question for IT organizations, both to reduce costs and to preserve the environment.
Computer systems are becoming increasingly ubiquitous and part of the national infrastructure of every country, resulting in large installations of computer systems that provide critical services. These installations, normally referred to as data centers, have grown to require significant levels of electrical power, thereby contributing to the consumption of fossil fuels and the generation of greenhouse gases.
Data transmitted on corporate networks has increased ten-fold in recent years,
and real-time access to information through a myriad of business applications
has become a distinguishing competitive advantage for many enterprises. This,
coupled with new requirements such as compliance, which necessitates storing consumer
data for longer periods, as well as business continuity and disaster
recovery, has elevated the demand for server farms (data centers). High capital
investment and lengthy delivery cycles are prohibitive for constructing captive
data centers. Data centers are the building blocks of any IT business organization,
providing capabilities of centralized storage, backups, management, networking
and dissemination of data in which the mechanical, lighting, electrical and
computing systems are designed for maximum energy efficiency and minimum environmental
impact (Kant, 2009). Data centers are found in nearly
every sector of the economy, ranging from financial services, media, high-tech,
universities, government institutions and many others. They use and operate
data centers to aid business processes, information management and communication
functions (Daim et al., 2009). Due to the rapid growth
in the size of data centers, demand for both physical infrastructure and IT
equipment is continuously increasing, and with it energy consumption. A data
center comprises many individual devices, including storage devices, servers,
chillers, generators and cooling towers. Servers are the main consumers of energy
because they exist in huge numbers, and their count grows continuously as data
centers expand. This increased consumption of energy drives up the production
of greenhouse gases, which are hazardous to environmental health and contribute
to global warming.
Virtualization technology, which allows IT administrators to run multiple independent
virtual servers on a single physical server and thereby enables optimal server
resource utilization, is being touted as a key solution for improving the space
and energy efficiency of data centers. The average utilization ratio of many
hardware devices installed in data centers is around
30%, or even less in some environments. Virtualization technology can increase
the utilization ratio up to 80% by properly managing the workloads between the servers
performing the processing. It also allows users to free up excess server capacity
quickly without costly expansion, relocation, or new construction. Virtualization
technology is now becoming an important advancement in IT especially for business
organizations and has become a top to bottom overhaul of the computing industry.
It combines or divides the computing resources of a server based environment
to provide different operating environments using different methodologies and
techniques like hardware and software partitioning or aggregation, partial or
complete machine simulation, emulation and time sharing (Singh).
It enables running two or more operating systems simultaneously on a single
machine. A Virtual Machine Monitor (VMM), or hypervisor, is software that provides
a platform to host multiple operating systems running concurrently, sharing
resources among each other to provide services to end users according
to service levels defined in advance (Green
Grid, 2009). Virtualization and server consolidation techniques are proposed
to increase the utilization of underutilized servers so as to decrease the energy
consumption of data centers and hence reduce their carbon footprint (Green
Grid, 2009). This paper explains some of the necessary requirements
to be fulfilled before implementing virtualization in any firm.
In recent years, IT infrastructure has continued to grow rapidly, driven by the demand for computational power created by modern compute-intensive business and scientific applications. However, a large-scale computing infrastructure consumes enormous amounts of electrical power, leading to operational costs that exceed the cost of the infrastructure itself within a few years. Ensuring a secure energy supply, preserving the environment and protecting the climate are central challenges facing today's world. Environmentally friendly technologies are the key to sustainable economic activity. In order to optimize the use of resources across the entire spectrum of global value chains, it is essential to tap the full potential of technology.
The commercial, organizational and political landscape has changed fundamentally
for data center operators due to a confluence of apparently incompatible demands
and constraints. The energy use and environmental impact of data centers has
recently become a significant issue for both operators and policy makers. Global
warming forecasts predict rising temperatures, melting ice and population dislocations
due to the accumulation of greenhouse gases in our atmosphere from the use of carbon-based
energy. Unfortunately, data centers represent a relatively easy target due to
the very high density of energy consumption and ease of measurement in comparison
to other, possibly more significant areas of IT energy use. Policy makers have
identified IT and specifically data center energy use as one of the fastest
rising sectors. At the same time the commodity price of energy has risen faster
than many expectations. This rapid rise in energy cost has substantially impacted
the business models for many data centers. Energy security and availability
is also becoming an issue for data center operators as the combined pressures
of fossil fuel availability, generation and distribution infrastructure capacity
and environmental energy policy make prediction of energy availability and cost
difficult (Rivoire et al., 2007).
As corporations look to become more energy efficient, they are examining their
operations more closely. Data centers have been found to be a major culprit,
consuming a great deal of the energy used in overall operations. In order to
handle the sheer magnitude of today's data, data centers have grown significantly
through the continuous addition of thousands of servers. These servers consume
ever more power and have become larger, denser, hotter and significantly more
costly to operate (McNamara et al., 2008). An EPA Report to Congress
on Server and Data Center Energy Efficiency completed in 2007 estimates that
data centers in USA consume 1.5% of the total USA electricity consumption for
a cost of $4.5 billion (EPA, 2007). From the year 2000
to 2006, data center electricity consumption has doubled in the USA and is currently
on a pace to double again by 2011 to more than 100 billion kWh, equal to $7.4
billion in annual electricity costs (Tung, 2008). The Gartner
group emphasizes the rising cost of energy by pointing out that energy's share
of IT budgets is expected to grow from 10% to over 50% in the next few years,
and that data center energy use will double in the next two years (Gartner,
2009). The statistics clearly show that the yearly power and cooling
bill for servers in data centers is around $14 billion; if this trend persists,
it will rise to $50 billion by the end of the decade (EPA, 2009).
A data center represents an area with a large concentration of electronic equipment
and power density within a limited space (Lefurgy et
al., 2003). Data centers are one of the largest energy consumers, accounting
for approximately 2% of total global energy use (Koomey,
2009). While the demand for and on data centers continues to increase, these
digital powerhouses are faced with several power, cooling, performance
and space constraints associated with environmental, technological and economic
sustainability (Schulz, 2009). Improving the energy efficiency
and environmental performance of data centers is therefore at the forefront
of organizations' actions in greening their information technology.
The energy consumed by various computing facilities creates different monetary,
environmental and system performance concerns.
Fig. 1: Data center and server efficiency (Source: EPA)
A recent study on the power consumption of server farms shows that in 2005 the
electricity used by servers worldwide, including their associated cooling and
auxiliary equipment, cost US$7.2 billion (Fig. 1).
With the increase in infrastructure and IT equipment, there is a considerable
increase in the energy consumed by data centers, and this energy consumption
is doubling every five years (Gartner, 2009). Today's
data centers are big consumers of energy and are filled with high-density, power-hungry
equipment. If data center managers remain unaware of these energy problems,
then energy costs will double between 2005 and 2011. If these costs
continue to double every five years, data center energy costs will increase
by 1600% between 2005 and 2025 (Caldow, 2008). Currently
the USA and Europe have the largest data center power usage, but the Asia-Pacific region
is rapidly catching up (Kumar, 2008).
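The 1600% figure is just compounding: four five-year doublings between 2005 and 2025 give a 16-fold increase. A quick sketch (the baseline cost of 100 is arbitrary):

```python
# Costs doubling every five years: after n doublings, cost = baseline * 2**n.
def projected_cost(baseline, years, doubling_period=5):
    doublings = years // doubling_period
    return baseline * 2 ** doublings

# 2005 -> 2025 spans 20 years, i.e. four doublings: a 16x (1600%) increase.
growth = projected_cost(100, 2025 - 2005) / 100
print(growth)  # 16.0
```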
PROPOSED VIRTUALIZATION IMPLEMENTATION TECHNIQUE
There is significant potential for energy efficiency improvements in data centers.
Many technologies are either commercially available or will soon be available
that could improve the energy efficiency of microprocessors, servers, storage
devices, network equipment and infrastructure systems. Still, there are plenty
of unexplored, reasonable opportunities to improve energy efficiency. Selection
of efficient IT equipment and reducing mechanical infrastructure increases the
energy efficiency. Improvements are possible and necessary at the level of the
whole facility, i.e., the system level and at the level of individual components.
It is not possible to optimize data centre components without considering the
system as a whole, still it is true that efficient components are important
for achieving an efficient facility; for instance, efficient servers generate
less waste heat which reduces the burden on the cooling system (Uddin
and Rahman, 2011a). The ability to influence key legislative decisions,
such as auctioning versus grandfathering allowances, will enable companies to
position for competitive advantage due to the significant asset value of the
allowances (Uddin and Rahman, 2011b). The first step
in greening data centers is to baseline all requirements so as to get the maximum
value out of a data center greening program. Now more than ever, energy efficiency
seems to be on everyone's mind. Faced with concerns such as global warming
and skyrocketing energy costs, more and more companies are considering if and
how to increase efficiency (Uddin et al., 2011).
Servers are the leading consumers of IT power in any data center. Data centers
are filled with thousands of mostly underutilized servers, with
utilization ratios of only 5 to 10%, consuming huge amounts of energy and generating
large amounts of greenhouse gases (Uddin and Rahman, 2010a).
This study focuses on the use of virtualization to overcome energy problems
in data centers. In this paper we propose a five-step process for implementing
virtualization in a data center that saves energy and at the same time increases
the productivity of servers with little or no additional energy consumption.
Virtualization has become popular in data centers since it provides an easy
mechanism to cleanly partition physical resources, allowing multiple applications
to run in isolation on a single server. It categorizes volume servers into different
resource pools depending on the workloads they perform, after which server consolidation
is applied. This technique decouples software from hardware and splits
multi-processor servers into more independent virtual hosts for better utilization
of the hardware resources, allowing services to be distributed one per processor.
In server consolidation, many small physical servers are replaced by one large
physical server to increase the utilization of expensive hardware resources,
reducing the consumption of energy and the emission of CO2 (Uddin
and Rahman, 2010b). Virtualization technology promises great opportunities
for reducing energy and hardware costs through server and resource consolidation,
live migration, data deduplication and data shrinkage. It offers different solutions
to the power and environmental problems faced by the data center industry by providing
an opportunity to consolidate multiple underutilized volume servers onto fewer
physical servers, thereby reducing the physical and environmental footprint: less
physical space is required to house the servers, fewer energy inputs are needed
to power them and the overall cost of ownership of the data center is reduced.
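As a rough illustration of consolidation (a hypothetical first-fit heuristic, not a method proposed in this paper), underutilized workloads can be packed onto as few hosts as possible while respecting a target utilization ceiling such as the 80% figure cited earlier:

```python
def consolidate(workloads, host_capacity=0.8):
    """First-fit packing of fractional CPU demands onto hosts,
    each capped at host_capacity (e.g. an 80% utilization target)."""
    hosts = []  # each host is a list of the workload demands placed on it
    for demand in sorted(workloads, reverse=True):
        for host in hosts:
            if sum(host) + demand <= host_capacity:
                host.append(demand)  # fits on an existing host
                break
        else:
            hosts.append([demand])  # provision a new host
    return hosts

# Ten servers, each ~10% utilized, fit on two hosts at an 80% cap.
packed = consolidate([0.10] * 10)
print(len(packed))  # 2
```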
Benefits of virtualization: Virtualization promises to radically transform computing through better utilization of the resources available in the data center, reducing overall costs and increasing agility. It reduces operational complexity and maintains flexibility in selecting software and hardware platforms and product vendors. It also increases agility in managing heterogeneous virtual environments. Some of the benefits of virtualization are:
Server and application consolidation: Virtual machines can be used to consolidate the workloads of several under-utilized servers onto fewer machines, perhaps a single machine. The benefits include savings on hardware and software, environmental costs, and the management and administration of the server infrastructure. The execution of legacy applications is also well served by virtual machines: a legacy application may not run on newer hardware or operating systems, and even if it does, it may under-utilize the server. Virtualization can therefore consolidate several such applications, which are usually not written to co-exist within a single execution environment. Virtual machines provide secure, isolated sandboxes for running untrusted applications; examples include address obfuscation. They also provide fault and error containment by isolating the applications and services they execute. Virtualization also provides a better platform and opportunity to create secure and independent computing applications.
Multiple execution environments: Virtual machines can be used to create
operating systems or execution environments that guarantee resource management
by using resource management schedulers with resource limitations. Virtual machines
can provide the illusion of hardware configurations such as SCSI devices. Virtualization
can also be used to simulate networks of independent computers. It enables
running multiple operating systems simultaneously, of different versions or
even from different vendors, sharing and executing the workloads of the different
applications being processed. Virtual machines also allow powerful debugging and
performance monitoring tools to be installed in the virtual machine monitor to
debug operating systems without losing productivity.
Software and hardware migration: Virtual machines aid application and system mobility by making possible the migration of software and hardware between different virtual machines to share and properly utilize the resources available. Large application suites can be treated as appliances by "packaging" and running each in a virtual machine. Virtual machines are great tools for research and academic experiments, as they provide isolation and encapsulate the entire state of a running system: since that state can be saved, examined, modified and reloaded, the virtual machine provides an abstraction of the workload being run. Virtualization enables existing operating systems to run on shared-memory multiprocessors. Virtual machines can be used to create arbitrary test scenarios and thus lead to very imaginative and effective quality assurance. Resource sharing is the major advantage of virtualization: resources are managed and allocated according to the demands of the applications and processes being executed, saving much of the energy that is otherwise consumed when resources remain idle and underutilized most of the time.
Manageability and reusability: Virtualization can be used to retrofit new features in existing operating systems without too much overhead. It can make tasks such as system migration, backup and recovery easier and more manageable. It also provides an effective means of binary compatibility across all hardware and software platforms to enhance manageability among different components of virtualization process.
Virtualization implementation process: Before implementing server virtualization in any firm, it is important to plan seriously and consider the risks associated with virtualization. It is also important for the data center to check whether it has the necessary infrastructure to handle the increased power and cooling densities arising from the implementation of virtualization, and to consider the failure of a single consolidated server, because it handles the workload of multiple applications. In order to properly implement virtualization, some questions need to be answered:
||What is virtualization?
||Why do we need it?
||How can it improve our business?
||What types of virtualization technologies exist?
||What is the cost/benefit ratio of virtualization?
||What new challenges will it bring to the business firm?
||What is the structure of the virtualization solution being implemented?
||Which applications or services are good virtualization candidates?
||Which server platforms are best suited to support virtualization?
Virtualization technology is now becoming an important advancement in IT, especially
for business organizations, and has become a top-to-bottom overhaul of the computing
industry (Uddin and Rahman, 2011a). Like any other IT
project, a virtualization implementation must be structured and designed
so that it fulfills the necessary requirements and stays
within the infrastructure domain already installed. It is much more than simply
loading a virtualization technology on a server and transforming one or two
workloads into virtual machines. Virtualization implementation involves several
key steps that must be followed to properly implement and manage virtualization so
that the desired objectives can be fulfilled. These phases provide a detailed
description of the prerequisites to be met before implementation:
||Apply and implement virtualization
||Hardware and software maximization
||Architecture and management
Innovation process: The process of virtualization starts by creating an inventory of all servers, applications, resources required by servers, available resources and their associated workloads; this is called the innovation phase (Fig. 2). It includes both utilized and idle servers and their associated resources and workloads. It also includes information related to:
||Make and model of the processor
||Type of processor (sockets, cores, threads, cache)
||Memory size and speed
||Network type (number of ports, speed of each port)
||Local storage (number of disk drives, capacity, RAID)
||Operating system and patch level (service level)
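A minimal sketch of such an inventory record, with illustrative field names and values (a real deployment would populate these from discovery tooling):

```python
from dataclasses import dataclass

@dataclass
class ServerInventory:
    """One inventory record per physical server. Field names are
    hypothetical; adapt them to the assessment tools actually in use."""
    make_model: str
    sockets: int
    cores_per_socket: int
    threads_per_core: int
    memory_gb: int
    nic_ports: int
    nic_speed_gbps: float
    disks: int
    disk_capacity_gb: int
    raid_level: str
    os_version: str
    patch_level: str

srv = ServerInventory("Example X100", 2, 4, 2, 32, 4, 1.0,
                      6, 300, "RAID 5", "Windows Server 2008", "SP2")
# Total logical processors = sockets x cores x threads.
print(srv.sockets * srv.cores_per_socket * srv.threads_per_core)  # 16
```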
Fig. 2: Phases of virtualization implementation process
The innovation process identifies and analyzes an organization's network before it is virtualized. It consists of the following sub-phases.
Inventory process: Data centers are huge entities consisting of many different
components and devices performing different tasks to meet end-user and business
needs. These components should be categorized into measurable resource pools
depending on the workloads they execute. The innovation process identifies
all the components of a data center and classifies them according
to different parameters like energy use, carbon emission, utilization ratio, type
of equipment, lifetime, etc. It is therefore important for a data center to
know in advance the total content of its infrastructure before implementing
virtualization. Many tools are available from different vendors for performing
an initial analysis of an organization. The Microsoft Baseline Security Analyzer (MBSA)
tool provides information such as IP addressing, operating system, installed
applications and, most importantly, the vulnerabilities of every scanned system. After
analysis, all generated values are linked to MS Visio, which generates a complete
inventory diagram of all components and also provides details about each component
analyzed. The Microsoft Assessment and Planning toolkit (MAP) is another tool
for the assessment of network resources. It works with Windows Management Instrumentation
(WMI), the remote registry service or the Simple Network Management Protocol
to identify systems on the network. VMware, the founder of x86 virtualization,
also offers tools for identifying servers that could be transformed
into virtual machines. VMware Guided Consolidation (VGC), a powerful tool, assesses
networks with fewer than 100 physical servers. Since VGC is an agentless tool,
it does not add any overhead to production servers' workloads (Dincer,
1999). Some of the major data center components are listed below:
||Uninterruptible power supplies (UPS)
||Computer room air conditioners
||Direct expansion (DX) units
||Distribution losses external to the racks
||Power distribution units (PDUs)
Classify servers: With the recent development and growth in the size of
these server farms, the number of servers continuously increases as the demand
for networking, storage, speed, backups, recovery and computation increases.
These servers consume a great deal of energy and power to perform processing, hence
generating considerable CO2, while their utilization ratio remains low. In
an average server environment, 30% of the servers are "dead", only
consuming energy without being properly utilized (Uddin
and Rahman, 2011b).
In a traditional data center, each server is devoted to a specific function. For instance, an e-mail server deals only with e-mail and a payroll server handles only payroll. This traditional way is inefficient: the e-mail server might run at 65% capacity during business hours to accommodate spikes in demand, while using significantly less capacity during non-business hours. The payroll server, on the other hand, might run at only 5% capacity during business hours as a few changes and queries are processed by personnel, holding the remainder of its capacity in reserve for the larger job of payroll processing after hours. Using virtualization, the e-mail server and payroll server can share the same machine, with e-mail processing using the bulk of the capacity during business hours and payroll processing using the bulk during off hours. Using this method, governments purchase and maintain less equipment. They also save on the cost of housing, powering and cooling huge server farms that only use a fraction of their processing power. Bringing all of a government's processing needs together under one roof can bring immense efficiency and cost benefits to the government organization.
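The e-mail/payroll example can be checked with simple arithmetic: the two servers can share one host as long as their combined hourly utilization never exceeds the host's capacity. The hourly profiles below are illustrative, built from the utilization figures in the text:

```python
# Hourly utilization profiles (fractions of one server's capacity).
# Business hours are taken as 09:00-17:00; off-hour levels are assumed.
email = [0.65 if 9 <= h < 17 else 0.15 for h in range(24)]    # busy by day
payroll = [0.05 if 9 <= h < 17 else 0.60 for h in range(24)]  # busy by night

combined = [e + p for e, p in zip(email, payroll)]
peak = max(combined)
print(peak)  # 0.75 -- the two workloads fit together on a single host
```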
Servers are categorized into different resource pools according to the resources and workloads associated with them. This is done to avoid any technical, political, security, privacy or regulatory concerns between servers that would prevent them from sharing resources. Once the analysis is performed, server roles can be categorized into groups. Server roles are categorized into the following service types:
||Network infrastructure servers
||Identity management servers
||File and print servers
||Dedicated web servers
Categorize application resources: After categorizing servers into different resource pools, applications are also categorized as:
||Commercial versus in house applications
||Legacy versus updated applications
||Support to business applications
||Line of business applications
||Mission critical applications
Allocation of computing resources: After categorizing servers, applications
and their associated workloads, the next step is to allocate the computing resources
required by these different workloads and then arrange them in normalized
form; for normalization, processor utilization should be at least 50%.
It is very important to normalize workloads so as to achieve maximum efficiency
in terms of energy, cost and utilization. The formula proposed in this study
for normalization is to multiply the utilization ratio of each server by its total processor
capacity, that is (maximum processor efficiency × No. of processors × No. of cores).
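The proposed formula can be sketched directly. Here clock speed in GHz stands in for "maximum processor efficiency" (an assumption, since the text does not fix a unit):

```python
def normalized_load(utilization, max_ghz, processors, cores):
    """Normalize a server's workload against its total processor capacity,
    per the formula in the text: utilization x (efficiency x CPUs x cores)."""
    capacity = max_ghz * processors * cores
    return utilization * capacity

# A 2-socket, 4-core, 3.0 GHz server running at 30% utilization:
print(round(normalized_load(0.30, 3.0, 2, 4), 2))  # 7.2 GHz of effective demand
```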
Apply and implement virtualization: After analyzing and categorizing
servers, applications and associated workloads, the next step defines virtualization
in detail: its advantages, its types, its layers and, most importantly, vendor identification.
Virtualization is the faithful reproduction of an entire architecture in software,
providing the illusion of a real machine to all software running above
it. More precisely, virtualization is a framework or methodology for dividing the
resources of a computer into multiple execution environments by applying one
or more concepts or technologies such as hardware and software partitioning,
time-sharing, partial or complete machine simulation, emulation, quality of
service and many others. It can be applied in software or hardware
or both, and to desktop computers as well as server machines.
To properly implement virtualization we presented a model called the layered virtualization
implementation model, consisting of five layers (Uddin and
Rahman, 2011a). Each layer defines more detailed processes to provide a
detailed treatment of the state of the art and the emerging challenges faced by data
center managers in implementing and managing virtualization properly in their data
centers to achieve the desired objectives. The proposed model holds that the
process of virtualization should be structured and designed so that
it fulfills the necessary requirements and stays within the scope and
infrastructure domain already installed in the data center. It is therefore
much more than simply loading a virtualization technology on different servers
and transforming one or two workloads into virtual machines. Rather, it is a
complex and rigorous process that needs to be implemented and monitored properly
(Uddin and Rahman, 2011b).
In the software-only virtualization technique, a Virtual Machine Monitor (VMM)
program is used to distribute resources among the multiple running threads. This
software-only solution has some limitations. One is the allocation
of memory space by guest operating systems where applications would conventionally
run. Another is binary translation, i.e., the necessity of an extra layer
of communication for binary translation in order to emulate the hardware environment
by providing interfaces to physical resources such as processors, memory, storage,
graphics cards and network adapters. Hardware virtualization is a
good solution to these problems and works in cooperation with the VMM.
This technique provides a new architecture upon which the operating
system can run directly, removing the need for binary translation and thus ensuring
increased performance and supportability. It also enhances the reliability,
security and flexibility of virtualization solutions.
The VMware Capacity Planner (VCP) tool can be used when the network extends beyond
100 physical servers. It generates reports on server utilization, including
CPU, memory, network and disk utilization on a server-by-server basis, and finally
identifies potential virtualization candidates. Other tools like CIRBA's
PowerRecon and PlateSpin's utilities are also very useful; they analyze technical
and non-technical factors in data centers and generate reports for the consolidation
of servers (Koomey, 2008). It should be noted that all
these analyses should run over a period of at least one month, which
will capture high and low utilization ratios for each server.
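A minimal sketch of reducing such month-long measurements to high/low utilization ratios and flagging consolidation candidates (the 30% peak threshold and the sample values are illustrative assumptions):

```python
def utilization_summary(samples):
    """Return the (low, high) utilization seen over a measurement period."""
    return min(samples), max(samples)

def is_candidate(samples, peak_threshold=0.30):
    """Flag a server for consolidation if even its peak load stays low."""
    _, high = utilization_summary(samples)
    return high <= peak_threshold

month = [0.05, 0.08, 0.12, 0.25, 0.07]  # illustrative daily CPU averages
print(is_candidate(month))  # True
```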
Hardware maximization: Servers are the major consumers of energy, as
they exist in huge quantities and perform most of the processing needed to run multiple
virtual workloads. It is important to consider hardware issues because the hardware
already available may not be suitable for providing high availability
of virtual workloads. A change may be required to install new hardware that
delivers the best price and performance. This process ensures high availability
of virtual workloads and also provides a leaner and meaner pool of resources
for them. One of the major issues in hardware maximization
is the proper utilization and availability of RAM for each virtual machine.
For this reason it is important to consider 64-bit architectures, which provide
better utilization and availability of RAM for all virtual and physical systems.
It is also important to consider the single point of failure, because one server
now runs the workloads of multiple servers: if this server goes down, the
whole virtualization effort fails. The chances of a single
point of failure at any stage can be removed by using redundancy and clustering
services to protect virtual workloads. Such services are provided by
Microsoft and Citrix, while VMware uses a custom configuration
approach called High Availability (HA).
Architecture and management: The architecture of a machine consists
of a set of instructions; those that allow inspecting or modifying machine
state must trap when executed in any mode other than the most privileged one. To
support proper hardware utilization, it is important to update and revise the whole
data center architecture. To protect virtual workloads, x64 systems should
be linked to shared storage and arranged into some form of high-availability
cluster so as to minimize single points of failure. This is the most critical,
time-consuming and painful operation when performed manually, since it includes
cloning an existing operating system and restoring it on an identical machine
while changing the whole underlying hardware, which can lead
to driver reinstallation or possibly the dreadful blue screen of death. To avoid
these problems, virtualization vendors have started to offer Physical
to Virtual (P2V) migration utilities. This utility software speeds up the move
and solves driver incompatibilities on the fly by removing physical
hardware dependencies from server operating systems, allowing them to be
moved and recovered. Instead of having to perform scheduled hardware maintenance
at some obscure hour over the weekend, server administrators can now live-migrate
a virtual machine to another physical resource and perform physical server hardware
maintenance in the middle of the business day.
The virtualization process should be properly managed and supervised over time
to better understand the relationships between the components of virtualization
and the process as a whole, so as to achieve the desired objectives. It is a
continuous process implemented in two phases. In the first phase, all the activities
and requirements needed for proper implementation of virtualization are monitored
and managed. In the second phase, management continues
even after the implementation of virtualization, to properly manage
the applications, their workloads, processes and the resources available to
meet those objectives. It is important to note that conversion should always
be performed while servers are offline, to protect existing services and maintain
Service Level Agreements (SLAs) with end users.
CONCLUSION AND RECOMMENDATIONS
This study highlights the importance of virtualization technology implemented
in data centers to save costs and maximize the efficiency of the different resources
available. We proposed a four-phase strategy to properly implement virtualization.
It starts by categorizing servers and their associated applications and resources
into different resource pools. It is important to consider that virtualization
not only needs to characterize the workloads that are planned to be virtualized,
but also to target the environments into which the workloads are to be placed.
It is important to determine the types of servers, their current status (idle
or busy), how much it will cost to implement server virtualization, the
type of technology needed to achieve the required service levels and, finally,
how to meet security and privacy objectives. It is also important for the data center
to check whether it has the necessary infrastructure to handle the increased
power and cooling densities arising from the implementation of virtualization,
and to consider the failure of a single consolidated server,
because it handles the workload of multiple applications.
Virtualization poses many challenges to the data center physical infrastructure like dynamic high density, under-loading of power/cooling systems and the need for real-time rack-level management. These challenges can be met by row-based cooling, scalable power and predictive management tools. These solutions are based on design principles that simultaneously resolve functional challenges and increase efficiency.