GCP — Maintain Scalability and High Availability using Compute Engine

Dineshkumar Murugan
8 min read · Feb 3, 2021

1. Introduction

This article explains how to maintain scalability and high availability using an HTTP Load Balancer in Google Cloud Platform (GCP).

GCP offers many built-in components for common requirements. A few of them are used here to enable scalability and high availability: VM instances, Instance Templates, Instance Groups, Health Checks, and Firewall rules.

2. Case Study

A simple case study illustrates the requirement. Install the Apache HTTP Server on a VM instance in GCP, in either a single region or multiple regions. The server is then hit with a large number of HTTP requests coming from the same region or from different regions.

Under this load, the system should dynamically scale the VM instances up or down to keep performance consistent. Scaling happens behind the HTTP Load Balancer whenever CPU utilization reaches the threshold or the configured Requests per Second (RPS) limit is crossed on a VM instance.

3. Steps

GCP makes this case study straightforward. The steps are:

a. Create Firewall

b. Create VM Instance Template

c. Create Health Check

d. Create VM Instance Group

e. Create HTTP Load Balancer

f. Load Test in SSH Client

Each step has been explained one by one in the following sections.

4. Create Firewall

Log in to the GCP console and go to VPC network => Firewall to create a new firewall rule. On top of the default values in the new firewall screen, provide the following values:

Name : firewall-rule-access-all-port

Target tags: access-all-port

IP ranges: 0.0.0.0/0

Protocols and ports: Allow all

The target tag value will be referenced in the VM instance template to allow all types of requests into the VM through the firewall. In general, requests can be restricted by protocol and port for ingress and egress traffic.

Refer to the screenshots below for reference.

Firewall Creation — 1
Firewall Creation — 2
Firewall Creation — 3
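
The same firewall rule can also be created from the command line. The sketch below assumes the default VPC network; adjust the network name if yours differs.

# Sketch: create the firewall rule with gcloud (assumes the default VPC network)
gcloud compute firewall-rules create firewall-rule-access-all-port \
    --network=default \
    --direction=INGRESS \
    --allow=all \
    --source-ranges=0.0.0.0/0 \
    --target-tags=access-all-port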

5. Create VM Instance Template

In the GCP Console, go to Compute Engine => Instance templates and click Create instance template, with the name instance-template-1. On top of the default values, provide the following values to create the new VM instance template.

Firewall: Select Allow HTTP traffic

Management Tab: Add the startup script below under Automation so that the Apache server is installed whenever a new VM instance is created from this template.

#! /bin/bash

sudo apt-get update

sudo apt-get install -y apache2

Networking Tab: Add the "access-all-port" tag under Network tags. This tag matches the target tag of the firewall rule created earlier (refer to the Create Firewall section).

Only these three changes are required to create the new VM instance template. Refer to the screenshots below.

VM Instance Template Creation — 1
VM Instance Template Creation — 2
VM Instance Template Creation — 3
VM Instance Template Creation — 4
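
For reference, the same template can be sketched with gcloud. The machine type, image family, and image project below are assumptions; the startup script is the one shown above.

# Sketch: create the instance template with gcloud (machine type and image are assumptions)
gcloud compute instance-templates create instance-template-1 \
    --machine-type=e2-medium \
    --image-family=debian-10 \
    --image-project=debian-cloud \
    --tags=http-server,access-all-port \
    --metadata=startup-script='#! /bin/bash
sudo apt-get update
sudo apt-get install -y apache2'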

6. Create Health Check

Go to Compute Engine => Instance Groups => Health checks and click Create health check to monitor the TCP and HTTP protocols on port 80.

Create a new health check named health-check-1 for TCP; no change is needed to the default values. For HTTP, create another health check named health-check-2; here the protocol must be changed from TCP to HTTP.

These two health checks will be used in the Instance Group and in the Load Balancer's backend service to continuously monitor server availability.
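
The equivalent gcloud commands are sketched below; both health checks probe port 80.

# Sketch: create the TCP and HTTP health checks with gcloud
gcloud compute health-checks create tcp health-check-1 --port=80
gcloud compute health-checks create http health-check-2 --port=80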

7. Create VM Instance Group

An Instance Group creates instances dynamically based on CPU utilization and Requests per Second (RPS). It can be configured in a single zone or in multiple zones; a multi-region option is not available. Create an Instance Group in more than one region to handle performance across regions. These Instance Groups are then configured in the Load Balancer, which handles requests globally. More details follow in the Load Balancer section.

In the Console, go to Compute Engine => Instance Groups => Instance groups. Create the Instance Group with the name instance-group-1.

On top of the default values, provide the following values to create the new VM Instance Group.

Location: Change to Multiple zones and select the preferred region

Instance Template: Choose the newly created instance template from the dropdown

Autoscaling metrics: CPU utilization of 60% is set by default and can be modified. Two more metrics can be added, but only the CPU metric is used in this case study.

Maximum number of instances: set to 4, since the GCP free trial has quota limitations

Health check: select the TCP health check created earlier (health-check-1)

Refer to the screenshot to create the Instance Group.

VM Instance Group Creation — 1
VM Instance Group Creation — 2
VM Instance Group Creation — 3
VM Instance Group Creation — 4
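
A command-line sketch of the same Instance Group is shown below. The region and the autohealing initial delay are assumptions; the 60% CPU target and the maximum of 4 instances match the values above.

# Sketch: create a regional managed instance group and enable CPU-based autoscaling
gcloud compute instance-groups managed create instance-group-1 \
    --region=us-central1 \
    --template=instance-template-1 \
    --size=1 \
    --health-check=health-check-1 \
    --initial-delay=300

gcloud compute instance-groups managed set-autoscaling instance-group-1 \
    --region=us-central1 \
    --min-num-replicas=1 \
    --max-num-replicas=4 \
    --target-cpu-utilization=0.60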

8. Create HTTP Load Balancer

A Load Balancer diverts requests to different servers, within the same region or in a different region, based on server availability.

In the LB, any number of instance groups can be configured in the backend configuration, but only one static IP is configured in the frontend configuration. This means one IP receives all the HTTP requests, while each request can be processed by a different server. The LB has the intelligence to find a server that is ready to process the request and forwards it there.

A few steps are needed to configure the LB; they are explained in detail below.

In the GCP Console, go to Network services => Load balancing and click Create load balancer. GCP provides three types of load balancing: HTTP(S), TCP, and UDP. Select HTTP(S) load balancing for this case study, then select the "From the internet to my VMs" radio button to proceed.

Enter the load balancer name load-balancer-1 and click Backend configuration.

a. Backend Configuration

Select the Create a backend service option from the dropdown.

Enter the backend service name backend-service-1. On top of the default values, add a new backend: select the instance group and set the balancing mode to either CPU utilization or Rate. More than one instance group can be configured to distribute requests across different servers, which helps with scalability and high availability.

Choose the HTTP health check that was created for load balancing (health-check-2) and create the backend configuration.

b. Host and Path Rules

In Host and path rules, select the newly created backend service

c. Frontend Configuration

Enter the frontend service name front-end-service-1. On top of the default values, create a new static IP address for the load balancer: a popup appears when you click "Create IP address"; provide a proper name and reserve the IP for this load balancer.

d. Review and Create

Finally, review the details and create the load balancer; it will take a few minutes to provision.

e. Monitoring

After the LB is created successfully, it appears with three tabs for checking the status of the backend and frontend. You can explore each tab to learn more.

Click the load balancer name to monitor the overall status and the status of each instance group.
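
The whole load balancer can also be assembled with gcloud. The sketch below wires together the named port, backend service, URL map, HTTP proxy, reserved static IP, and forwarding rule; the region and the proxy, address, and forwarding-rule names are assumptions.

# Sketch: build the HTTP load balancer with gcloud (region and extra resource names are assumptions)
gcloud compute instance-groups set-named-ports instance-group-1 \
    --named-ports=http:80 \
    --region=us-central1

gcloud compute backend-services create backend-service-1 \
    --protocol=HTTP \
    --port-name=http \
    --health-checks=health-check-2 \
    --global

gcloud compute backend-services add-backend backend-service-1 \
    --instance-group=instance-group-1 \
    --instance-group-region=us-central1 \
    --balancing-mode=UTILIZATION \
    --global

gcloud compute url-maps create load-balancer-1 \
    --default-service=backend-service-1

gcloud compute target-http-proxies create lb-http-proxy \
    --url-map=load-balancer-1

gcloud compute addresses create lb-static-ip --global

gcloud compute forwarding-rules create lb-forwarding-rule \
    --global \
    --address=lb-static-ip \
    --target-http-proxy=lb-http-proxy \
    --ports=80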

9. Load Test using SSH Client

Servers have now been created in multiple regions and configured in the Load Balancer, which distributes requests across them based on availability. Now we apply a load test to check the system's behavior.

In the Console, go to Compute Engine => VM instances. Create one VM with the default values.

Open the SSH client which appears in the VM instance list screen.

In the SSH client, set the load balancer IP address in an environment variable using the following command:

export LB_IP=<<LB IP Address>> (use the reserved static IP address created in the frontend configuration)

The apache2-utils package is used for the load test. Execute the following commands over SSH to install the package and put heavy load on the server.

sudo apt-get update

sudo apt-get install apache2-utils

ab -n 500000 -c 1000 http://$LB_IP/

Initially, the requests go to the American region, since the load test VM was created in that region. The LB finds the nearest server and sends requests there until CPU utilization or RPS crosses the threshold limit.

Once the threshold is crossed, i.e. CPU utilization is greater than 60% on all servers in instance-group-1, the LB finds an alternate server in a different region to forward requests to. In the same way, the LB handles all requests and maintains consistent performance.

The LB takes responsibility for forwarding requests to the instance groups, and each instance group scales up or down based on load and its predefined configuration.
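
To watch the autoscaler react while the load test runs, you can list the instances in the group from another terminal; the region below is an assumption.

# Sketch: observe instances being added or removed during the test (region is an assumption)
watch -n 10 gcloud compute instance-groups managed list-instances instance-group-1 --region=us-central1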

10. Conclusion

As per the case study, requests were distributed across various servers, either within the same region or in a different region, based on server load and the Load Balancer's intelligence. GCP provides many simple options for handling scalability and high availability; here only Compute Engine features were used to achieve the case study.

I hope you have now learned how to use the Load Balancer and Instance Groups in GCP to maintain scalability and high availability.

The same requirement can be achieved in a much simpler way with Google Kubernetes Engine (GKE); that will be explained in the next article.
