Abiquo 5.0

Introduction

A single Abiquo installation can handle multiple datacenters, as shown in the following diagram:

To scale up the cloud service, you can add more server appliances, so that the API load is distributed among several of them. The following diagram shows the previous environment with multiple server appliances:

 

Abiquo allows using load balancers in front of multiple server appliances to distribute the requests among them, thus supporting more concurrent requests and providing fail-over capabilities.

To configure multiple API nodes we need a data node. Data nodes contain the services that Abiquo uses to communicate and to store the platform data: MariaDB, RabbitMQ, Redis, and Zookeeper.

Note that you can install the services on different data nodes. When the data nodes are ready, you can deploy as many server and/or remote services appliances as desired to improve performance and fault tolerance. Abiquo will internally distribute all event processing through the API nodes. However, external API requests will have to be distributed by installing a load balancer (e.g. Apache) in front of this configuration, as sketched below.
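
For illustration only, a minimal Apache httpd balancer for the API could look like the following sketch using mod_proxy_balancer. The file path and member addresses are placeholders (the member IPs reuse the example nodes shown later in this page); see the Proxy and Load Balancing Configuration link below for the supported setup.

Example Apache httpd balancer configuration (illustrative values)
# /etc/httpd/conf.d/abiquo-api-balancer.conf -- hypothetical file name
<Proxy "balancer://abiquo-api">
    # Replace with the addresses of your Abiquo server appliances
    BalancerMember "http://10.60.1.223:8080"
    BalancerMember "http://10.60.1.228:8080"
    ProxySet lbmethod=byrequests
</Proxy>

ProxyPass        "/api" "balancer://abiquo-api/api"
ProxyPassReverse "/api" "balancer://abiquo-api/api"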

The following diagram shows how the load-balanced APIs will require access to the same abiquo.properties configuration and the same shared 3rd-party services.  


To provide fault tolerance we have defined a Leader Election recipe. The recipe ensures that at all times one of the API nodes is the Leader, which deals with the events sent from any module in the platform. User requests are balanced and distributed to any of the API nodes (including the leader itself). We use Zookeeper to keep track of all live API nodes and to select one, and only one, Leader, to make sure that all the events are processed.

Configuration

See the Abiquo DevOps website for practical information on configuring API load balancing: Proxy and Load Balancing Configuration

Server appliances

All server appliances must have a common abiquo.properties file to ensure all of them will access the proper information. These are the common properties that all the API nodes must be configured to use:

Server appliances shared properties
# RabbitMQ
abiquo.rabbitmq.username=RABBITMQ_USER
abiquo.rabbitmq.password=RABBITMQ_PASS
abiquo.rabbitmq.addresses=RABBITMQ_HOST1:RABBITMQ_PORT1,...,RABBITMQ_HOSTn:RABBITMQ_PORTn

# Redis
abiquo.redis.port=REDIS_PORT
abiquo.redis.host=REDIS_HOST

# Zookeeper
abiquo.api.zk.serverConnection=ZK_HOST:ZK_PORT

These properties are marked in green in the Abiquo Configuration Properties documentation. You will also have to configure the API JDBC connector to point to the shared MariaDB database. See How to set up a remote MariaDB server.
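
For orientation only, a shared database connection for the API typically takes the form of a Tomcat JDBC resource like the sketch below. The resource name, pool settings, and placeholders are assumptions for illustration; follow the linked guide for the values Abiquo actually expects.

Example Tomcat JDBC resource (illustrative context.xml fragment)
<!-- Resource name, database name, and pool settings are illustrative placeholders -->
<Resource name="jdbc/abiquoDB" auth="Container" type="javax.sql.DataSource"
          driverClassName="org.mariadb.jdbc.Driver"
          url="jdbc:mariadb://DATANODE_HOST:3306/DB_NAME"
          username="DB_USER" password="DB_PASS"
          maxTotal="20" maxIdle="10" />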

Note on API Leader concept

Tasks between the server and remote services instances are handled with RabbitMQ. All APIs are able to process requests from clients and queue asynchronous tasks to remote services and the API itself. The API leader node is the only one that consumes from the scheduler queue (because the requests are to be processed one by one) and the remote services response queues (because all of the messages must be consumed and processed in order).

To guarantee that there will always be a single leader we use the Apache Curator framework. This is a well-known and widely adopted solution that guarantees there will be exactly one leader, or no leader at all (if no API is up and running).

In the worst-case scenario, when the leader fails while processing a message, another leader will be elected and will continue with the job. Asynchronous jobs in the new leader API will take care of any message left behind.

Abiquo Scheduler

Checking the current leader

When a node takes the leadership, it prints the following message in api.log:

INFO c.a.a.w.l.LeadElectionContextListener - Current API is the /api/leader-election leader

All the API participants are registered in the /api/leader-election path:

$ zktreeutil -z 10.60.1.5:2181 -p /api -D
...

/
|   
|--[api]
|   |   
|   |--[leader-election]
|       |   
|       |--[_c_151aa53a-1e97-4c96-b4f6-ac7e70e36bef-lock-0000000009 => eruiz/10.60.1.228]
|       |   
|       |--[_c_f6af1685-0da8-4a36-88b2-1ceba2ff15f6-lock-0000000008 => apuig/10.60.1.223]
|   
|--[zookeeper]
    |   
    |--[quota]


The current leader is the registered node with the lowest lock value.

0000000008 < 0000000009 -->  apuig/10.60.1.223

The znode content is the node hostname, so it is important to configure a resolvable hostname on each node to avoid a useless localhost/127.0.0.1 entry.
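
For example, on each API node you can set and verify a resolvable hostname before starting the platform (the hostname itself is illustrative):

Hostname check on an API node
# Set a resolvable hostname
hostnamectl set-hostname api1.example.com
# Verify that the hostname resolves to the node address, not to 127.0.0.1
getent hosts $(hostname)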

Basic API Leader Example

For example, the asynchronous deployment task of a virtual machine consists of three jobs:

  1. Schedule: select the physical machine and reserve resources.
  2. Configure: create the virtual machine on the hypervisor.
  3. Power on: start the virtual machine on the hypervisor.

In a multiple-API environment, any API node can queue the scheduling task, but only the leader processes it. The leader then queues the deploy task (the configure and power-on jobs) in the virtual factory. When the virtual factory completes each job, it puts the result in the datacenter notifications queue, which is again consumed by the leader.
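
A simplified view of this message flow, using the queue roles described above, would be:

Simplified deploy message flow
any API node    --(scheduler queue)-------------> leader API        schedule job
leader API      --(virtual factory queue)-------> remote services   configure + power on
remote services --(datacenter notifications)----> leader API        results, in order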

Addition of an API node to a running cluster

No specific configuration is required to add a node to a running cluster. Simply replicate the properties on the new node and register it with the load balancer used in the environment.
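
For example, assuming the Apache balancer sketched earlier, adding a node reduces to the following steps (file paths and host names are illustrative):

Adding a new API node (illustrative commands)
# On the new server appliance: replicate the shared configuration
scp existing-api:/opt/abiquo/config/abiquo.properties /opt/abiquo/config/abiquo.properties
# On the load balancer: add the new member to the balancer block and reload
#   BalancerMember "http://NEW_API_HOST:8080"
systemctl reload httpd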

Remote Services appliances

All remote services servers in datacenters with API load balancing must be configured to use the same RabbitMQ instance.

The Remote Services Servers only communicate with the datacenter notifications queue in the datanode RabbitMQ instance. DO NOT change the Redis properties for the datacenters.

Remote services appliances shared properties
# RabbitMQ
abiquo.rabbitmq.username=RABBITMQ_USER
abiquo.rabbitmq.password=RABBITMQ_PASS
abiquo.rabbitmq.addresses=RABBITMQ_HOST1:RABBITMQ_PORT1,...,RABBITMQ_HOSTn:RABBITMQ_PORTn
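
A quick way to verify that each remote services appliance can reach the shared broker is a TCP check against each RabbitMQ address (5672 is the default port):

Broker connectivity check
nc -zv RABBITMQ_HOST1 RABBITMQ_PORT1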

Limitations or clarifications

  • Load balancing is focused on the central server node. No load balancing is applied to remote services, which are not a bottleneck because they use a RabbitMQ instance to manage requests.
  • All API instances must be configured with the same values in abiquo.properties.
  • Multiple node installation: since the technologies used in our data node (MariaDB, Redis, RabbitMQ, Zookeeper) are widely known and properly documented on their own project homepages, all issues related to balancing, sharding, replication, etc. affecting them are up to system administrators. We currently do not provide any support on how to install or configure these systems with replication or balancing features.