An introduction to the circuit breaker design pattern, with a practical guide to using it in day-to-day development

If your home is powered by electricity, you have almost certainly heard of a “breaker” or “breaker switch”. What does a breaker do in the home grid? When there is a short circuit or a leakage, the breaker opens and cuts the electricity, which protects the home and its appliances from damage. Once the circuit is back to normal, we can close the switch manually. This is the fault-tolerance strategy of the home grid.

Image recreated by agnasarp.com and used for informational purposes only. It is not affiliated with any real-world product or service. Image source: pixabay.com

Likewise, our software applications need a mechanism for fault tolerance. Here we focus mainly on fault tolerance in microservices. Today’s software systems are moving from monolithic architecture to microservices architecture, in which we create a number of granular, independent microservices that are easy to maintain. Different service consumers, such as web and mobile applications, call these microservices remotely to serve their customers. Microservices may also call other microservices, both in the same system and in third-party systems. Sooner or later, we have to deal with a called microservice that is unavailable or unresponsive, and we need an effective fault-tolerance mechanism to keep the whole system up and running. The circuit breaker design pattern is a solution to that problem.

When we talk about a remote service such as a microservice, we mainly consider two aspects:

  • Availability of the remote service
  • Responsiveness of the remote service

When it comes to responsiveness, the service consumer can wait until the remote service returns a response. However, waiting for a long time is not a good idea because it hurts the end-user experience, so we need to set timeout values inside service consumers for remote services. There are different ways to pick a timeout value. One method is to double the average response time. For example, if a remote service has an average response time of 2000 milliseconds (ms), we can set the timeout to 4000 ms.

timeout = average response time × 2
        = 2000 ms × 2
        = 4000 ms
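To make the rule concrete, here is a minimal sketch of a consumer-side timeout using only the JDK. The class name TimeoutCaller and its methods are hypothetical and are not part of the sample services later in this article; the remote call is represented by any Callable.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; the names are illustrative only.
final class TimeoutCaller {

    private TimeoutCaller() {
    }

    // Rule of thumb from the text: timeout = average response time x 2.
    static Duration timeoutFor(Duration averageResponseTime) {
        return averageResponseTime.multipliedBy(2);
    }

    // Runs the remote call on a worker thread and gives up once the timeout elapses.
    static <T> T call(Callable<T> remoteCall, Duration timeout) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            return executor.submit(remoteCall).get(timeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            throw new IllegalStateException("Remote call timed out after " + timeout.toMillis() + " ms", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the remote call", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Remote call failed", e.getCause());
        } finally {
            executor.shutdownNow(); // interrupt the worker if it is still running
        }
    }
}
```

With an average response time of 2000 ms, timeoutFor returns 4000 ms, and call gives up on any remote call that takes longer than the timeout it is given.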

Although this seems fine, we face a serious problem when the remote service creates a resource in a backend layer such as a database. Suppose the underlying remote service places orders. If the response is delayed, the service consumer times out the request and reports a failure, even though the order has actually been placed in the background. Since the service consumer is not aware of this, it submits a new request, which creates a duplicate order. Catching and eliminating such duplicates would require complex logic. As a quick fix, we can adjust the timeout value further. Instead of doubling the average response time, we can double the maximum time the service has taken to serve a request successfully under load (max under load). Assuming a max under load of 5000 ms, the timeout value will be:

timeout = max under load × 2
        = 5000 ms × 2
        = 10000 ms

The service consumer will then have to wait up to 10000 ms (10 s) for a response. This is unacceptable because it badly affects the user experience. The circuit breaker pattern is the solution to this problem.

The circuit breaker sits between the service consumer and the microservice and continuously monitors the microservice.

  • When nothing is wrong, the circuit breaker is in the closed state: every service request goes to the microservice, and the service consumer gets the expected response. In this scenario, the microservice is available and responsive.
  • If the microservice is down or unresponsive, the circuit breaker detects that and changes its state to open. Unlike the timeout scenario, the service consumer gets a meaningful response saying that the microservice is unavailable within just a few milliseconds, for example 20 ms.
  • Meanwhile, the circuit breaker keeps checking the microservice. As long as it is still unavailable or unresponsive, the circuit stays open.
  • Once the microservice is available again, the circuit breaker changes its state from open to closed, routes service requests from the service consumer to the microservice again, and returns the expected responses.
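The state changes above can be sketched as a small state machine. This is a minimal, single-threaded illustration and not how Hystrix is implemented. The class SimpleCircuitBreaker, the consecutive-failure threshold, and the retry interval are all assumptions made for the example. The clock is injected so the behaviour can be exercised without real waiting.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Minimal, single-threaded sketch; names and policy are assumptions for illustration.
final class SimpleCircuitBreaker {

    enum State { CLOSED, OPEN }

    private final int failureThreshold;  // consecutive failures before the circuit opens
    private final long retryAfterMillis; // how long to fail fast before trying the service again
    private final LongSupplier clock;    // current time in ms, injected for testability

    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0L;

    SimpleCircuitBreaker(int failureThreshold, long retryAfterMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.retryAfterMillis = retryAfterMillis;
        this.clock = clock;
    }

    State state() {
        return state;
    }

    // Calls the service when allowed; otherwise answers immediately with the fallback.
    <T> T call(Supplier<T> service, Supplier<T> fallback) {
        boolean retryDue = clock.getAsLong() - openedAt >= retryAfterMillis;
        if (state == State.OPEN && !retryDue) {
            return fallback.get(); // fail fast, the service is not touched
        }
        try {
            T result = service.get();
            state = State.CLOSED; // the service answered, so route traffic normally again
            consecutiveFailures = 0;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;
                openedAt = clock.getAsLong();
            }
            return fallback.get();
        }
    }
}
```

While the circuit is open, the consumer receives the fallback without any network wait, which is where the "few milliseconds" response in the list above comes from.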

Next, we look at how the circuit breaker monitors the microservice. There are three types of monitoring mechanisms, and correspondingly three types of circuit breakers: the simple heartbeat, synthetic transactions, and real-user monitoring. We will look at the capability of each mechanism.

Simple heartbeat

This simply checks the health of the microservice. When the heartbeat succeeds, the breaker knows that the service is up and available to serve requests, but it has no idea how long a request will take, which means it knows nothing about responsiveness. Therefore, a simple heartbeat is not ideal for circuit breaking.

Synthetic transactions

Here we use fake transactions, generated only to test whether the microservice is available and responsive. This mechanism has a few problems. One is that we can check only a few scenarios. If we apply the strategy to an order-placing service, the fake transactions will cover only certain types of orders, so not all scenarios are covered. Another is that we have to ask every downstream system, even reporting tools, not to process this dummy data, which is painful to coordinate.
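The downstream burden can be illustrated with a small sketch. The Order class and its synthetic flag are hypothetical and exist only for this example. The point is that every consumer of the order data would need a filter like this one.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical order model; the "synthetic" flag is an assumption for illustration.
final class SyntheticOrders {

    private SyntheticOrders() {
    }

    static final class Order {
        final String id;
        final boolean synthetic; // true for fake orders created only to probe the service

        Order(String id, boolean synthetic) {
            this.id = id;
            this.synthetic = synthetic;
        }
    }

    // The filter that each downstream system (billing, reporting and so on) would have to apply.
    static List<String> realOrderIds(List<Order> orders) {
        return orders.stream()
                .filter(order -> !order.synthetic)
                .map(order -> order.id)
                .collect(Collectors.toList());
    }
}
```

If even one downstream system forgets this filter, the fake orders leak into its results, which is why the approach is hard to maintain.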

Real-user monitoring

We use only real transactions for this monitoring, and rely on trend analysis rather than a timeout. Suppose 10 requests have response times in ms of 1.6, 2.0, 2.3, 1.9, 2.2, 2.3, 2.5, 3.0, 4.5, 5.8. The response time is clearly trending upward. Because there are ups and downs, this analysis is usually based on the standard deviation. For example, if a response time is more than 3 standard deviations above the mean, the circuit is opened. If the circuit stayed open all the time, the service consumer would never get the expected responses. To overcome this, the circuit breaker moves from the open state to a half-open state, in which it lets a limited number of requests through to the microservice. This proportion can be made configurable, say 1 out of 10 requests. If the breaker tries this and gets a response time of 6.9 ms, it stays in the half-open state because the responsiveness is still not in line with the expected value. If the response time then drops to 1.6 ms, the breaker closes the circuit and requests are routed normally again.
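The standard deviation check can be sketched in a few lines. The class ResponseTimeMonitor and its method names are hypothetical, and the sketch assumes the population standard deviation over the ten sample values quoted above. For those values the mean is 2.81 ms and the standard deviation is roughly 1.26 ms, so the "3 standard deviations" limit sits at about 6.58 ms.

```java
import java.util.Arrays;

// Hypothetical helper for the trend check; names and policy are assumptions.
final class ResponseTimeMonitor {

    enum State { OPEN, HALF_OPEN, CLOSED }

    private ResponseTimeMonitor() {
    }

    static double mean(double[] samples) {
        return Arrays.stream(samples).average().orElse(0.0);
    }

    // Population standard deviation of the samples.
    static double standardDeviation(double[] samples) {
        double avg = mean(samples);
        double variance = Arrays.stream(samples)
                .map(x -> (x - avg) * (x - avg))
                .average()
                .orElse(0.0);
        return Math.sqrt(variance);
    }

    // True when the latest response is more than 3 standard deviations above the mean.
    static boolean isTooSlow(double[] history, double latestMillis) {
        return latestMillis > mean(history) + 3 * standardDeviation(history);
    }

    // In the half-open state only every n-th request is let through (1 in 10 in the text).
    static boolean passesThrough(long requestNumber, int oneIn) {
        return requestNumber % oneIn == 0;
    }

    // State after a trial request in the half-open state.
    static State afterProbe(double[] history, double probeMillis) {
        return isTooSlow(history, probeMillis) ? State.HALF_OPEN : State.CLOSED;
    }
}
```

With the sample history, a trial response of 6.9 ms is above the limit and keeps the circuit half-open, while 1.6 ms is within it and closes the circuit, which matches the walk-through above.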

Practical implementation

We have created sample microservices with Hystrix to demonstrate the circuit breaker functionality. We used Spring Initializr to generate the services, with the Web, Rest Repositories, and Actuator dependencies for agnasarp-hystrix-employee-service and the Web, Actuator, Hystrix, and Hystrix Dashboard dependencies for agnasarp-hystrix-department-service.

Microservice: agnasarp-hystrix-employee-service

Project structure

pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.4.2</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.agnasarp</groupId>
    <artifactId>agnasarp-hystrix-employee-service</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>agnasarp-hystrix-employee-service</name>
    <description>Agnasarp Hystrix Employee Service</description>
    <properties>
        <java.version>1.8</java.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-actuator</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-rest</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

</project>

application.properties

server.port = 8280

EmployeeServiceApplication.java

package com.agnasarp.employeeservice;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class EmployeeServiceApplication {

    public static void main(String[] args) {
        SpringApplication.run(EmployeeServiceApplication.class, args);
    }

}

EmployeeServiceController.java

package com.agnasarp.employeeservice.controller;

import com.agnasarp.employeeservice.domain.Employee;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

@RestController
public class EmployeeServiceController {

    // In-memory stand-in for a database: department name -> employees
    private static final Map<String, List<Employee>> agnasarpDB = new HashMap<>();

    static {
        List<Employee> lst = new ArrayList<>();
        lst.add(new Employee("Max", "Male"));
        lst.add(new Employee("Lisa", "Female"));
        agnasarpDB.put("it-department", lst);

        lst = new ArrayList<>();
        lst.add(new Employee("Steve", "Male"));
        lst.add(new Employee("Anne", "Female"));
        agnasarpDB.put("finance-department", lst);
    }

    @GetMapping(value = "/getEmployeeDetailsForDepartment/{department}")
    public List<Employee> getEmployees(@PathVariable String department) {
        System.out.println("Getting Employee details for " + department);

        List<Employee> employeeList = agnasarpDB.get(department);
        if (employeeList == null) {
            // Unknown department: return a single placeholder entry
            employeeList = new ArrayList<>();
            employeeList.add(new Employee("Not Found", "N/A"));
        }
        return employeeList;
    }
}

Employee.java

package com.agnasarp.employeeservice.domain;

public class Employee {

    private String name;
    private String gender;

    public Employee(String name, String gender) {
        super();
        this.name = name;
        this.gender = gender;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getGender() {
        return gender;
    }

    public void setGender(String gender) {
        this.gender = gender;
    }
}

Service call

Request URL: http://localhost:8280/getEmployeeDetailsForDepartment/it-department

Service consumer: agnasarp-hystrix-department-service

Project structure

pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.3.9.BUILD-SNAPSHOT</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.agnasarp</groupId>
    <artifactId>agnasarp-hystrix-department-service</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>agnasarp-hystrix-department-service</name>
    <description>Agnasarp Hystrix Department Service</description>
    <properties>
        <java.version>1.8</java.version>
        <spring-cloud.version>Hoxton.BUILD-SNAPSHOT</spring-cloud.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-actuator</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-rest</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-netflix-hystrix-dashboard</artifactId>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
            <exclusions>
                <exclusion>
                    <groupId>org.junit.vintage</groupId>
                    <artifactId>junit-vintage-engine</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
    </dependencies>
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.cloud</groupId>
                <artifactId>spring-cloud-dependencies</artifactId>
                <version>${spring-cloud.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
    <repositories>
        <repository>
            <id>spring-milestones</id>
            <name>Spring Milestones</name>
            <url>https://repo.spring.io/milestone</url>
        </repository>
        <repository>
            <id>spring-snapshots</id>
            <name>Spring Snapshots</name>
            <url>https://repo.spring.io/snapshot</url>
            <snapshots>
                <enabled>true</enabled>
            </snapshots>
        </repository>
    </repositories>
    <pluginRepositories>
        <pluginRepository>
            <id>spring-milestones</id>
            <name>Spring Milestones</name>
            <url>https://repo.spring.io/milestone</url>
        </pluginRepository>
        <pluginRepository>
            <id>spring-snapshots</id>
            <name>Spring Snapshots</name>
            <url>https://repo.spring.io/snapshot</url>
            <snapshots>
                <enabled>true</enabled>
            </snapshots>
        </pluginRepository>
    </pluginRepositories>

</project>

application.properties

server.port = 8380
management.endpoints.web.exposure.include=*
management.endpoints.web.base-path=/

DepartmentServiceApplication.java

package com.agnasarp.departmentservice;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.circuitbreaker.EnableCircuitBreaker;
import org.springframework.cloud.netflix.hystrix.dashboard.EnableHystrixDashboard;
import org.springframework.context.annotation.ComponentScan;

@SpringBootApplication
@EnableHystrixDashboard
@EnableCircuitBreaker
@ComponentScan(basePackages = "com.agnasarp.departmentservice.*")
public class DepartmentServiceApplication {

    public static void main(String[] args) {
        SpringApplication.run(DepartmentServiceApplication.class, args);
    }
}

DepartmentServiceController.java

package com.agnasarp.departmentservice.controller;

import com.agnasarp.departmentservice.service.DepartmentServiceDelegate;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class DepartmentServiceController {

    private final DepartmentServiceDelegate departmentServiceDelegate;

    public DepartmentServiceController(DepartmentServiceDelegate departmentServiceDelegate) {
        this.departmentServiceDelegate = departmentServiceDelegate;
    }

    @GetMapping(value = "/getDepartmentDetails/{department}")
    public String getEmployees(@PathVariable String department) {
        System.out.println("Going to call employee service to get data!");
        return departmentServiceDelegate.callEmployeeServiceAndGetData(department);
    }
}

DepartmentServiceDelegate.java

package com.agnasarp.departmentservice.service;

import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.core.ParameterizedTypeReference;
import org.springframework.http.HttpMethod;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

import java.util.Date;

@Service
public class DepartmentServiceDelegate {

    @Autowired
    RestTemplate restTemplate;

    // Declared here for brevity; a separate @Configuration class is the more common home for this bean
    @Bean
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }

    // Hystrix wraps this call and invokes the fallback method when it fails or the circuit is open
    @HystrixCommand(fallbackMethod = "callEmployeeServiceAndGetDataFallback")
    public String callEmployeeServiceAndGetData(String department) {

        System.out.println("Getting Department details for " + department);

        String response = restTemplate
                .exchange("http://localhost:8280/getEmployeeDetailsForDepartment/{department}",
                        HttpMethod.GET,
                        null,
                        new ParameterizedTypeReference<String>() {
                        },
                        department)
                .getBody();

        System.out.println("Response Received as " + response + " - " + new Date());

        return "NORMAL FLOW !!! - Department Name - " + department + " ::: " +
                " Employee Details " + response + " - " + new Date();
    }

    // Must take the same parameters and return the same type as the protected method
    @SuppressWarnings("unused")
    private String callEmployeeServiceAndGetDataFallback(String department) {

        System.out.println("Employee Service is down!!! fallback route enabled...");

        return "CIRCUIT BREAKER ENABLED!!! No Response From Employee Service at this moment. " +
                " Service will be back shortly - " + new Date();
    }
}

Service call

Request URL: http://localhost:8380/getDepartmentDetails/it-department

When the employee service is up and running, the circuit breaker is in its closed state.

When the employee service is down, the circuit breaker is in its open state and the fallback method callEmployeeServiceAndGetDataFallback is called.

Hystrix stream

Hystrix keeps track of the calls to the employee service. While the circuit is open, it periodically lets a trial request through, so once the service is back to normal the circuit changes its state from open to closed and we get the success response from the department service again.
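How quickly the circuit opens and how soon it tries the service again are configurable. The fragment below is a sketch, assuming the Hystrix command properties are set in the department service's application.properties, which Spring Cloud Netflix passes on to Hystrix. The 4000 ms timeout echoes the doubling example from earlier in this article and is illustrative only. The other three values are the Hystrix defaults, written out to make them visible.

```properties
# Give up on a call after 4000 ms (the Hystrix default is 1000 ms)
hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds=4000
# Minimum number of requests in the rolling window before the circuit can open
hystrix.command.default.circuitBreaker.requestVolumeThreshold=20
# Open the circuit when at least this percentage of requests fail
hystrix.command.default.circuitBreaker.errorThresholdPercentage=50
# Time the circuit stays open before a single trial request is let through
hystrix.command.default.circuitBreaker.sleepWindowInMilliseconds=5000
```

Replacing "default" with a command key applies the settings to that command only.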

Hystrix dashboard

Visual representation of the circuit in the Hystrix dashboard

Unfortunately, it is not working here, but if you configure it properly you will get the visual representation.

Download the source from GitHub:

Download employee-service: https://github.com/Agnasarp/agnasarp-hystrix-employee-service
Download department-service: https://github.com/Agnasarp/agnasarp-hystrix-department-service

Originally published at https://www.agnasarp.com on February 8, 2021.