Horizontal Pod Autoscaling (HPA) governs the spinning up of additional pods when the existing resources (CPU and memory) of the microservice are exhausted or, for runtime, when the message count threshold for the queue is exceeded. The additional pods are deleted once the resources are freed or restored for the microservice and the message count falls below its threshold value.

In Adeptia Connect, autoscaling is enabled by default. You can configure and use either Kubernetes HPA (default) or KEDA for autoscaling of the microservices' pods.

When you use KEDA,

The autoscaling of runtime pods happens based on the threshold values for Message Queue, CPU, and memory you set in the global values.yaml file. For more details, refer to this section.
Tip
For a dedicated runtime (Deployment) pod, you need to set the threshold values for Message Queue, CPU, and memory while creating the Deployment. For more details, refer to this page.
The autoscaling of other microservices' pods happens based only on the threshold values for CPU and memory you set in the global values.yaml file. For more details, refer to this section.

When you use Kubernetes' HPA,

The autoscaling of runtime pods happens based only on the threshold values for CPU and memory you set in the global values.yaml file. For more details, refer to this section.
Tip
For a dedicated runtime (Deployment) pod, you need to set the threshold values for CPU and memory while creating the Deployment. For more details, refer to this page.
The autoscaling of the other microservices' pods happens based only on the threshold values for CPU and memory you set in the global values.yaml file. For more details, refer to this section.
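
The rules above can be summarized as a small lookup. The following Python sketch is illustrative only: the function name scaling_metrics and the string labels it uses are hypothetical and are not part of Adeptia Connect or its values.yaml file.

```python
def scaling_metrics(autoscaler, is_runtime):
    """Return the metrics that drive autoscaling, per the rules described above.

    autoscaler: "keda" or "hpa" (hypothetical labels used only in this sketch).
    is_runtime: True for runtime pods, False for all other microservices.
    """
    if autoscaler not in ("keda", "hpa"):
        raise ValueError(f"unknown autoscaler: {autoscaler!r}")
    metrics = ["cpu", "memory"]
    # Only runtime pods scaled by KEDA also react to the message queue threshold.
    if autoscaler == "keda" and is_runtime:
        metrics.insert(0, "message-queue")
    return metrics


print(scaling_metrics("keda", True))
print(scaling_metrics("hpa", True))
```

In this sketch, the message queue threshold appears only for the combination of KEDA and runtime pods; every other combination scales on CPU and memory alone.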

Configuring HPA for microservices (excluding runtime)

To enable HPA, you need to set the parameters as described below for each of the microservices individually. You can find these parameters in the respective section of each microservice in the global values.yaml file.

Parameter	Description	Default value
autoscaling:
enabled:	Enables HPA when set to true.	true
type:	Defines whether autoscaling is based on CPU, memory, or both. The possible values are cpu, memory, and cpu-memory.	cpu
minReplicas:	The minimum number of pods for a microservice.	1
maxReplicas:	The maximum number of pods a microservice can scale up to.	1
targetCPUUtilizationPercentage:	CPU utilization, as a percentage of the CPU requests set for the pods in the global values.yaml file, at which the HPA spins up a new pod.	400
targetMemoryUtilizationPercentage:	Memory utilization, as a percentage of the memory requests set for the pods in the global values.yaml file, at which the HPA spins up a new pod.	400
behavior:
scaleUp:
stabilizationWindowSeconds:	The duration (in seconds) for which the application watches for spikes in resource utilization by the currently running pods. This helps determine whether scaling up is required.	300
maxPodToScaleUp:	The maximum number of pods a microservice can add at a time.	1
periodSeconds:	The interval (in seconds) at which spikes in resource utilization by the currently running pods are tracked.	60
scaleDown:
stabilizationWindowSeconds:	The duration (in seconds) for which the application watches for a drop in resource utilization by the currently running pods. This helps determine whether scaling down is required.	300
maxPodToScaleDown:	The maximum number of pods a microservice can remove at a time.	1
periodSeconds:	The interval (in seconds) at which a drop in resource utilization by the currently running pods is tracked.	60
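
To show how the target utilization, minReplicas, and maxReplicas values interact, the following Python sketch applies the replica formula documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current utilization / target utilization). The function name desired_replicas is hypothetical, the 10% tolerance is the Kubernetes default rather than an Adeptia Connect setting, and the sketch ignores the behavior settings (stabilization window and per-period limits).

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=1, max_replicas=1, tolerance=0.1):
    """Approximate the replica count the Kubernetes HPA would aim for.

    Utilization values are percentages of the resource requests, which is
    why a target such as 400 is possible when usage can exceed the requests.
    """
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # The result is always kept within the minReplicas..maxReplicas range.
    return max(min_replicas, min(max_replicas, desired))


# Two pods at 500% of requests with a 400% target and room to grow.
print(desired_replicas(2, 500, 400, min_replicas=1, max_replicas=5))
# Same load, but maxReplicas left at its documented default of 1.
print(desired_replicas(1, 900, 400))
```

Under these assumptions, a microservice whose maxReplicas stays at the default of 1 never receives additional pods, however high its utilization, so maxReplicas has to be raised for scaling up to have any effect.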

Configuring Kubernetes' HPA for runtime microservice

Like other microservices, the runtime microservice pods are scaled up or down based on two metrics: CPU utilization and memory utilization. However, the parameters for configuring autoscaling of the runtime microservice differ slightly from those for the rest of the microservices.

The following table describes the autoscaling parameters for the runtime microservice. You can find these parameters in the runtimeImage: section of the global values.yaml file.

Parameter	Description	Default value
RUNTIME_AUTOSCALING_ENABLED:	Enables HPA when set to true.	true
RUNTIME_MIN_POD:	The minimum number of pods.	1
RUNTIME_MAX_POD:	The maximum number of pods the runtime microservice can scale up to.	1
RUNTIME_AUTOSCALING_TYPE:	Defines whether autoscaling is based on CPU, memory, or both. The possible values are cpu, memory, and cpu-memory.	cpu
RUNTIME_AUTOSCALING_TARGETCPUUTILIZATIONPERCENTAGE:	CPU utilization, as a percentage of the CPU requests set for the runtime pods in the global values.yaml file, at which the HPA spins up a new pod.	400
RUNTIME_AUTOSCALING_TARGETMEMORYUTILIZATIONPERCENTAGE:	Memory utilization, as a percentage of the memory requests set for the runtime pods in the global values.yaml file, at which the HPA spins up a new pod.	400
RUNTIME_SCALE_UP_STABILIZATION_WINDOW_SECONDS:	The duration (in seconds) for which the application watches for spikes in resource utilization by the currently running pods. This helps determine whether scaling up is required.	300
RUNTIME_MAX_POD_TO_SCALE_UP:	The maximum number of pods the runtime microservice can add at a time.	1
RUNTIME_SCALE_UP_PERIOD_SECONDS:	The interval (in seconds) at which spikes in resource utilization by the currently running pods are tracked.	60
RUNTIME_SCALE_DOWN_STABILIZATION_WINDOW_SECONDS:	The duration (in seconds) for which the application watches for a drop in resource utilization by the currently running pods. This helps determine whether scaling down is required.	300
RUNTIME_MAX_POD_TO_SCALE_DOWN:	The maximum number of pods the runtime microservice can remove at a time.	1
RUNTIME_SCALE_DOWN_PERIOD_SECONDS:	The interval (in seconds) at which a drop in resource utilization by the currently running pods is tracked.	60
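
The effect of RUNTIME_MAX_POD_TO_SCALE_UP and RUNTIME_SCALE_UP_PERIOD_SECONDS can be pictured with the following Python sketch. It assumes these parameters behave like a Pods-type scaling policy in the Kubernetes HPA (at most N pods added per period), which this page does not confirm. The function name scale_up_schedule is hypothetical, and the stabilization window is ignored.

```python
def scale_up_schedule(current_pods, desired_pods,
                      max_pods_per_period=1, period_seconds=60):
    """Return (elapsed_seconds, pod_count) steps for a rate-limited scale-up.

    max_pods_per_period stands in for RUNTIME_MAX_POD_TO_SCALE_UP and
    period_seconds for RUNTIME_SCALE_UP_PERIOD_SECONDS (assumed mapping).
    """
    if max_pods_per_period < 1 or period_seconds < 1:
        raise ValueError("limits must be positive")
    steps = []
    elapsed = 0
    pods = current_pods
    while pods < desired_pods:
        # Add at most max_pods_per_period pods, then wait one full period.
        pods = min(desired_pods, pods + max_pods_per_period)
        steps.append((elapsed, pods))
        elapsed += period_seconds
    return steps


# With the documented defaults (1 pod per 60 seconds), going from 1 to 4 pods
# takes three steps spread over two minutes.
for elapsed, pods in scale_up_schedule(1, 4):
    print(f"t={elapsed}s pods={pods}")
```

Under this assumption, raising RUNTIME_MAX_POD_TO_SCALE_UP shortens the time needed to reach a given pod count, while RUNTIME_MAX_POD still caps the final number of pods.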

Load balancing among the runtime pods

...

Synchronous requests are processed by a runtime pod that the Kubernetes Service selects at random when it is set to its default iptables proxy mode.

Asynchronous requests are processed based on the concurrency level you set for the runtime pods of the Deployment. For example, if there are three runtime pods (each with a concurrency of 5) and eight messages in the queue, here is how they are routed:

...

When all three runtime pods are fully occupied, the remaining messages in the queue are prioritized and routed to a runtime pod as soon as it has a free slot.

Configuring KEDA for runtime microservice

...

Version	Old Version 1	New Version 2
Changes made by	Rohan Dhanwade	Rohan Dhanwade
Saved on	Jul 29, 2022	Jan 04, 2023

Related topic
