The Open Source Cloud

Components of the Cloud
The cloud as a whole is very complex, and several open source projects address its various aspects. In general, we can break that complexity down into a few core components:

1. Virtualization
2. Network Services
3. Configuration Management
4. Storage
5. Monitoring
6. Communication / Remote Execution

Each of these components supports the cloud platform by meeting specific service and infrastructure requirements. Identifying the role of several open source cloud projects will help clarify where each project comes into play and what its scope is within the overall open source cloud movement.


Virtualization

The virtualization layer is built on a hypervisor, and there are many to choose from. The choice of hypervisor should depend on the application being supported. For instance, security, transience, and performance requirements are weighed against the OS that will run the service.

Virtualization is the core of the cloud idea, and is what sparked the recent “Cloud” hype. Working alongside the virtualization layer is the “Cloud Controller”, a central system that controls the virtual machines (VMs) on many servers. Cloud controllers communicate with the hypervisors and VMs to manage virtual machine operations. A cloud controller usually has the means to set up new VMs, destroy VMs, migrate VMs, and report on the location of VMs.
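
The four operations listed above can be illustrated with a minimal sketch. This is a toy in-memory model, not the API of any project listed in this article; the class name CloudController, its method names, and the host and VM names are all hypothetical. A real controller would talk to hypervisors instead of updating a dictionary.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CloudController:
    """Toy model: records which hypervisor host each VM runs on."""

    hosts: set[str]
    placement: dict[str, str] = field(default_factory=dict)  # VM name -> host

    def create_vm(self, vm: str, host: str) -> None:
        """Set up a new VM on the given host."""
        if host not in self.hosts:
            raise ValueError(f"unknown host: {host}")
        if vm in self.placement:
            raise ValueError(f"VM already exists: {vm}")
        self.placement[vm] = host

    def destroy_vm(self, vm: str) -> None:
        """Remove a VM; raises KeyError if it does not exist."""
        del self.placement[vm]

    def migrate_vm(self, vm: str, target: str) -> None:
        """Move an existing VM to another known host."""
        if target not in self.hosts:
            raise ValueError(f"unknown host: {target}")
        if vm not in self.placement:
            raise KeyError(vm)
        self.placement[vm] = target

    def locate_vm(self, vm: str) -> str:
        """Report the host a VM currently runs on."""
        return self.placement[vm]
```

For example, after creating a VM on one host and migrating it to another, locate_vm returns the second host.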

Open source virtualization cloud controllers include:

1. OpenStack – cloud platform developed by Rackspace and NASA, with some 135+ contributing companies.
2. WSO2 Stratos – a Platform as a Service (PaaS) with extensive support for core services.
3. Xen Cloud Platform – enterprise level cloud platform featuring the Xen Hypervisor.
4. Eucalyptus – cloud computing software platform for private IaaS clouds.
5. Nebula – cloud computing platform for NASA scientists and researchers.
6. OpenNebula – IaaS cloud computing platform.
7. Cloud Foundry – open source cloud Platform as a Service.
8. OpenQRM – data center management platform.
9. Linux KVM – full virtualization solution for Linux.

Network Services

These are the classic services that provide things like DNS and DHCP. When selecting these services, redundancy and scalability need to be considered. This is why tools like dnsmasq are mostly confined to the “home network”, while tools like BIND and ISC DHCPD are usually used to supply enterprise and cloud network services. Network switches in a cloud solution can be virtualized or physical.

Open source network services solutions include:

1. Quantum (later renamed Neutron) – an incubated project from OpenStack to provide Networking as a Service (NaaS).

Configuration Management

This is the method used to configure servers for deployment. There are many approaches available today, ranging from setting up servers manually, to pre-defining a server image, to using a configuration management system.

Traditionally, configuration management was done by defining images, but most infrastructures today opt for configuration management systems.
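
What sets a configuration management system apart from a one-off setup script is that it converges a server toward a declared state and changes nothing when the server already matches. The following is a minimal sketch of that idea, assuming server state can be modeled as a flat dictionary; the function names plan_changes and converge and the example settings are hypothetical, and this is not how any of the tools listed below is actually implemented.

```python
from __future__ import annotations


def plan_changes(current: dict[str, str], desired: dict[str, str]) -> dict[str, str]:
    """Return only the settings whose current value differs from the desired one."""
    return {key: value for key, value in desired.items() if current.get(key) != value}


def converge(current: dict[str, str], desired: dict[str, str]) -> dict[str, str]:
    """Apply the needed changes to `current` in place and return what was changed."""
    changes = plan_changes(current, desired)
    current.update(changes)
    return changes


# Hypothetical server state and desired configuration.
server = {"ntp": "absent", "sshd_port": "22"}
wanted = {"ntp": "installed", "sshd_port": "22"}

first_run = converge(server, wanted)   # only the ntp setting needs to change
second_run = converge(server, wanted)  # nothing left to do, so no changes
```

Running converge a second time returns an empty dictionary, which is the property that makes repeated runs safe.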

Open source configuration management systems include:

1. Puppet – configuration management automation.
2. Chef – systems integration framework built specifically for automating the cloud.
3. CFEngine – configuration management system.
4. Salt States – configuration management interface for Salt.


Storage

Most enterprise infrastructures require large amounts of storage. Storage systems are distinguished by a number of factors, primarily size versus speed. Building a large distributed filesystem or distributed datastore (depending on the service, sometimes you want both) that is slower to write to but still fast to read from is the most cost-effective way to build out a large storage backend. These systems work well in almost all situations other than a high-speed database.

Open source projects that facilitate these types of services include:

1. GlusterFS – distributed filesystem, acquired by Red Hat in 2011.
2. MooseFS – a distributed filesystem that is fast, reliable, and resilient (my personal favorite).
3. Ceph – a promising distributed filesystem built on btrfs, a recent filesystem technology; still under heavy development.
4. Hadoop – distributed data store backed by many corporations.
5. Swift – OpenStack object store.


Monitoring

This is a traditional service used to monitor the health of an infrastructure. Monitoring tools have been around for a long time; one of the oldest examples is Nagios. Some of the leading monitoring and alerting systems are:

1. Nagios – IT infrastructure monitoring.
2. Zabbix – monitoring system.
3. Zenoss – cloud management system.

Communication / Remote Execution

The remote execution component is often underestimated. Many tools use an SSH-based remote execution system. The goal of a remote execution system is to provide a means of executing commands on large numbers of servers without having to log into them one at a time.

This provides a generic interface for collecting data from selected servers and running routine commands. Some available tools in this space are:

1. RunDeck – automates ad-hoc and routine procedures in data center and cloud environments.
2. MCollective – a framework to build server orchestration or parallel job execution systems.
3. Salt – fast and simple remote execution system.
4. Func – a two-way authenticated system for tools, systems and programs to communicate.
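
The basic fan-out idea behind SSH-based remote execution can be sketched in a few lines. This is only an illustration, not how any of the tools above works internally. It assumes the system ssh client is installed and that key-based authentication to each host is already set up; the function names ssh_runner and run_everywhere are hypothetical.

```python
from __future__ import annotations

import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def ssh_runner(host: str, command: str) -> str:
    """Run one command on one host through the system ssh client.

    Assumes key-based authentication; BatchMode makes ssh fail
    instead of prompting for a password.
    """
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host, command],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.stdout


def run_everywhere(
    hosts: list[str],
    command: str,
    runner: Callable[[str, str], str] = ssh_runner,
    max_workers: int = 10,
) -> dict[str, str]:
    """Run the same command on every host in parallel; return output per host."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `hosts`.
        outputs = pool.map(lambda host: runner(host, command), hosts)
        return dict(zip(hosts, outputs))
```

Passing the runner as a parameter keeps the fan-out logic independent of SSH, so a different transport (or a stub for testing) can be swapped in without touching run_everywhere.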

Please contact us if you have information that can help improve this listing of various open source cloud projects.
