Adding Health Checks and an SSL Certificate in Kong DB-less Mode

Recently I was working on an open-source project at my current company where all requests arrive at an API gateway service, and the gateway then routes each request to the appropriate service based on the URL pattern.

Those who work with gateway services know there are many to choose from, e.g. Amazon API Gateway, Apigee Edge, Kong, etc. Kong is one of the most popular, and it is the API gateway we use in our project.

I'm new to Kong. Our code base uses the Lua programming language to write custom Kong plugins.

What is Kong?

Kong is a microservice API gateway. It provides a flexible abstraction layer that securely manages communication between clients and microservices through APIs. This kind of component is also known as an API gateway, API middleware, or, in some cases, a service mesh. Kong was released as an open-source project in 2015, and its core values are high performance and extensibility.

Kong is a Lua application running in Nginx and made possible by the lua-nginx-module.

Kong configuration is usually stored in PostgreSQL or Cassandra, depending on our needs. Around 2019, Kong introduced a new way of writing configuration called DB-less mode, where the whole Kong configuration is defined in a YAML file. This is really good, as we don't need to manage a separate database.
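To give a feel for the format, here is a minimal sketch of a declarative file that routes by URL pattern. The service name, host, and path are hypothetical, and the `_format_version` value depends on your Kong release ("2.1" is an assumption here), so treat it as a starting point rather than a reference.

```yaml
# kong.yml -> minimal DB-less configuration (illustrative values only)
_format_version: "2.1"   # assumption: Kong 2.x; check the value your version expects

services:
  - name: auth-service            # hypothetical service name
    url: http://auth.internal:80  # hypothetical address of the backing service
    routes:
      - name: auth-route
        paths:
          - /auth                 # requests whose path starts with /auth go to auth-service
```

Kong loads such a file when `database` is set to `off` and `declarative_config` points to the file's path.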

Note: In DB mode, Kong is configured through its REST Admin API. In DB-less mode most of the Admin API is read-only, so we have to rely on the docs or on examples to write the YAML configuration.

This is where the problem starts, as the Konghq docs don't show enough examples or details about how to write the YAML configuration. They cover a lot of theory about each feature, but the lack of examples makes writing the YAML hard for a newbie like me. I spent hours and hours writing a single YAML object, which would have been much easier if the docs had provided a simple example.

There is another way to solve the problem that I learned about later: decK, a command-line tool for managing Kong configuration. Alternatively, we can use DB mode to add all the configuration and then export it to YAML.

Still, since DB-less mode is a Kong feature, I think the docs could be updated in the future to make this easier for newcomers.

Because I was working on a gateway service that was already written in DB-less mode, I would have had to convert the whole configuration to DB mode and then export it, which was not an option for me.

The Problem

Background

We deployed the app on AWS ECS; I'll share the details in a separate blog post.

As a deployment strategy, we decided to deploy the services privately and attach an SRV record to each one. Only the gateway service is exposed to the public, since we use Kong as the load balancer.

After deploying the services, we tested that everything worked. But when we scaled the services down and watched the logs, we noticed that traffic was still being sent to an upstream instance that was terminating because of the manual scale-down we had triggered.

After some time, we found out that we can add health checks to the upstream so that Kong routes requests only to healthy targets.

Question

How can we do this in DB-less mode?

As I explained earlier, the Konghq docs are not good for this scenario, at least for now, so I'll provide a complete example in this post in case anyone faces the same problem. Adding these two features (health checks and an SSL certificate) to the DB-less YAML configuration was really hard for me. After hours of searching the internet and asking in the forum, I was finally able to add them.

Healthcheck Details:

The objective of the health checks functionality is to dynamically mark targets as healthy or unhealthy, for a given Kong node.

Kong provides two types of health checks:

  • Active
  • Passive

Active checks, where a specific HTTP or HTTPS endpoint in the target is periodically requested and the health of the target is determined based on its response;

Passive checks (also known as circuit breakers), where Kong analyzes the ongoing traffic being proxied and determines the health of targets based on their behavior responding to requests.

The following details about active and passive health checks are taken from the Konghq docs.

Active health checks, as the name implies, actively probe targets for their health. When active health checks are enabled in an upstream entity, Kong will periodically issue HTTP or HTTPS requests to a configured path at each target of the upstream. This allows Kong to automatically enable and disable targets in the balancer based on the probe results. The periodicity of active health checks can be configured separately for when a target is healthy or unhealthy. If the interval value for either is set to zero, the checking is disabled at the corresponding scenario. When both are zero, active health checks are disabled altogether.

Note: Active health checks currently only support HTTP/HTTPS targets. They do not apply to Upstreams assigned to Services with the protocol attribute set to "tcp" or "tls".
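As a rough illustration of the description above, an upstream with only active health checks might look like the following. The upstream name, health path, and all numbers are assumptions chosen for readability, not recommendations.

```yaml
# kong.yml -> upstream with active health checks only (illustrative values)
upstreams:
  - name: auth.internal          # hypothetical upstream name
    healthchecks:
      active:
        type: http
        http_path: /health       # hypothetical health endpoint on each target
        timeout: 1               # seconds to wait for each probe
        healthy:
          interval: 10           # probe healthy targets every 10 s (0 disables)
          successes: 2           # 2 good probes mark a target healthy again
          http_statuses: [200, 302]
        unhealthy:
          interval: 5            # probe unhealthy targets every 5 s (0 disables)
          http_failures: 3       # 3 probes with a status listed below mark a target unhealthy
          timeouts: 3            # 3 timed-out probes mark a target unhealthy
          http_statuses: [429, 500, 503]
```

Note that each `http_statuses` value is a YAML list of integers, not a single string.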

Passive health checks (circuit breakers)

Passive health checks, also known as circuit breakers, are checks performed based on the requests being proxied by Kong (HTTP/HTTPS/TCP), with no additional traffic being generated. When a target becomes unresponsive, the passive health checker will detect that and mark the target as unhealthy. The ring-balancer will start skipping this target, so no more traffic will be routed to it. Once the problem with a target is solved and it is ready to receive traffic again, the Kong administrator can manually inform the health checker that the target should be enabled again, via an Admin API endpoint:

$ curl -i -X POST http://localhost:8001/upstreams/my_upstream/targets/10.1.2.3:1234/healthy
HTTP/1.1 204 No Content

This command will broadcast a cluster-wide message to propagate the “healthy” status to the whole Kong cluster. This will cause Kong nodes to reset the health counters of the health checkers running in all workers of the Kong node, allowing the ring-balancer to route traffic to the target again. Passive health checks have the advantage of not producing extra traffic, but they are unable to automatically mark a target as healthy again: the “circuit is broken”, and the target needs to be re-enabled again by the system administrator.
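For comparison, a passive-only configuration could be sketched like this. Again, the upstream name and the thresholds are hypothetical values picked for illustration.

```yaml
# kong.yml -> upstream with passive health checks only (illustrative values)
upstreams:
  - name: auth.internal          # hypothetical upstream name
    healthchecks:
      passive:
        type: http
        healthy:
          successes: 5           # proxied successes counted toward "healthy"
          http_statuses: [200, 201, 204, 301, 302]
        unhealthy:
          http_failures: 3       # proxied responses with a status listed below
          tcp_failures: 2        # failed TCP connections to the target
          timeouts: 2            # proxied requests that timed out
          http_statuses: [429, 500, 503]
```

Because passive checks cannot bring a target back on their own, the configuration used in this post enables both kinds: passive checks react to real traffic, and active probes let a recovered target return automatically.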

# kong.yml -> upstreams object
# Assumption: the ${...} placeholders are substituted with real values
# (for example by a templating step at deploy time) before Kong loads this file.
upstreams:
  # Auth service
  - algorithm: round-robin
    name: ${AUTH_URL}
    healthchecks:
      threshold: 2
      # Active checks: Kong probes each target on a schedule
      active:
        unhealthy:
          http_statuses: ${UPSTREAM_ACTIVE_UNHEALTHY_HTTP_STATUSES}
          timeouts: ${UPSTREAM_ACTIVE_UNHEALTHY_TIMEOUTS_COUNT}
          # as written, this reuses the passive TCP failure count variable
          http_failures: ${UPSTREAM_PASSIVE_UNHEALTHY_TCP_FAILURES_COUNT}
          interval: ${UPSTREAM_UNHEALTHY_REFRESH_INTERVAL}
        type: ${UPSTREAM_HEALTHCHECK_URL_PROTOCOL}
        http_path: ${UPSTREAM_HEALTHCHECK_URL}
        timeout: ${UPSTREAM_HEALTHCHECK_TIMEOUT}
        healthy:
          successes: ${UPSTREAM_ACTIVE_HEALTHY_SUCCESS_COUNT}
          interval: ${UPSTREAM_HEALTHY_REFRESH_INTERVAL}
          http_statuses: ${UPSTREAM_HEALTHY_HTTP_STATUSES}
        https_verify_certificate: false
        concurrency: ${UPSTREAM_ACTIVE_CONCURRENT_REQUEST}
      # Passive checks: Kong watches the traffic it is already proxying
      passive:
        unhealthy:
          http_failures: ${UPSTREAM_PASSIVE_UNHEALTHY_HTTP_FAILURES_COUNT}
          http_statuses: ${UPSTREAM_PASSIVE_UNHEALTHY_HTTP_STATUSES}
          tcp_failures: ${UPSTREAM_PASSIVE_UNHEALTHY_TCP_FAILURES_COUNT}
          timeouts: ${UPSTREAM_PASSIVE_UNHEALTHY_TIMEOUT_COUNT}
        healthy:
          http_statuses: ${UPSTREAM_HEALTHY_HTTP_STATUSES}
          successes: ${UPSTREAM_PASSIVE_HEALTHY_SUCCESS_COUNT}
        type: ${UPSTREAM_HEALTHCHECK_URL_PROTOCOL}
    slots: 10000

Let's go through these fields to understand what they mean. Everything discussed here can also be found in the Kong docs.

name: the hostname of the upstream. A Service whose host matches this name is load-balanced across the upstream's targets.

algorithm: the load-balancing algorithm to use. We use round-robin, which is the default; the other options are consistent-hashing and least-connections.

Now let's explore the healthchecks object.

threshold: The minimum percentage of the upstream’s targets’ weight that must be available for the whole upstream to be considered healthy. Default: 0

slots: The number of slots in the load balancer algorithm (default 10000). With round-robin, this setting determines the maximum number of slots. The slots are laid out on a ring and assigned to targets in proportion to their weights; a pointer then advances around the ring to pick the target for each request.

The active object has two nested objects, healthy and unhealthy. Kong uses this configuration to schedule the health check probes, which are executed by its workers. Let's explore it.

active.type: Whether to perform active health checks using HTTP or HTTPS. Default: http. The other accepted values are tcp, https, grpc, and grpcs, but tcp currently does not work for active health checks.

active.http_path: The path of the service's health endpoint. Default: /.

active.timeout: Socket timeout for active health checks, in seconds. Default: 1.

active.concurrency: Number of targets to check concurrently in active health checks. Default: 10.

active.https_verify_certificate: Whether to verify the SSL certificate of the target when probing over HTTPS. Default: true.

active.healthy.http_statuses and active.unhealthy.http_statuses: arrays of HTTP status codes that count as a success or a failure, respectively, when deciding whether a target is healthy or unhealthy.

active.healthy.interval and active.unhealthy.interval: the time, in seconds, between probes of healthy and unhealthy targets, respectively. Setting an interval to 0 disables the corresponding check.

http_failures, timeouts, and tcp_failures: counts that act as thresholds. Once a target accumulates that many failures of the given kind, it is marked unhealthy.

successes: the number of successes required before a target is considered healthy.

Above we defined the upstream object, but to work, an upstream needs targets. These can be nested inside the upstream or defined separately. We define them separately, as shown below.

# kong.yml -> targets object
targets:
  - upstream: ${AUTH_URL}
    target: ${AUTH_URL}:80
    weight: 1

targets is an array. Each entry is attached to an upstream through targets.upstream, which must match the name property of that upstream.

targets.target: the address and port of the target, in host:port form. If the target is an SRV record, just use the SRV record name.

targets.weight: the relative weight of the target. Targets with a higher weight receive proportionally more traffic.
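To show how the pieces connect, here is a small sketch with literal values instead of placeholders. All names and addresses are hypothetical. It uses the nested form for targets and assumes a Service whose host equals the upstream name, which is how Kong knows to load-balance that Service across the targets.

```yaml
# kong.yml -> service routed through an upstream (illustrative values)
services:
  - name: auth-service           # hypothetical service name
    host: auth.internal          # must equal the upstream name below
    port: 80
    protocol: http
    routes:
      - name: auth-route
        paths:
          - /auth

upstreams:
  - name: auth.internal
    targets:                     # targets nested under their upstream
      - target: 10.0.1.10:80     # hypothetical instance addresses
        weight: 100
      - target: 10.0.1.11:80
        weight: 50               # receives roughly half the traffic of the first target
```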

Another problem arose when we tried to add the SSL certificate to the certificates object in kong.yml.

# kong.yml
certificates:
  - cert: # public key
    key: # private key
    snis:
      - name: "" # domain name or wildcard domain name *.example.com

The certificates object looks like the above, but the cert and key values need to be written as multiline strings:

certificates:
  - cert: |-
      -----BEGIN CERTIFICATE-----
      -----END CERTIFICATE-----
    key: |-
      -----BEGIN PRIVATE KEY-----
      -----END PRIVATE KEY-----
    snis:
      - name: "*.example.com"

This is not documented in the Konghq docs, so anyone who tries to add a certificate without the multiline syntax will get an invalid certificate error.

In conclusion, Kong is a great and popular API gateway. DB-less mode can be really helpful for anyone who doesn't want to maintain a database instance, but the lack of documentation about it can lead to frustration. I'm a developer too, so I know that keeping docs up to date is hard. Hopefully Kong will eventually add proper documentation for DB-less mode.