Horizontal Pod Autoscaling (HPA) governs the spinning up of additional pods when a microservice exhausts its existing resources (CPU and memory) or, in the case of the runtime microservice, when the message count in the queue exceeds its threshold. The additional pods are deleted when resource usage and the message count fall back below their threshold values.

In Adeptia Connect, you can configure and use either Kubernetes HPA (default) or Kubernetes Event-driven Autoscaler (KEDA) to autoscale the microservices' pods. If you want to autoscale the runtime pods based on the Message Queue in addition to CPU and memory, you need to use KEDA.

To use KEDA, you first need to install it. For installation steps, refer to the Deploying KEDA page.

When you use KEDA,

  • The autoscaling of runtime pods can happen based on the threshold values you set for Message Queue, CPU, or memory, or any combination of these three parameters. You can make these configurations in the global values.yaml file.

    To use KEDA, you first need to enable it by setting the value of the type variable to keda under the global > config > autoscaling section in the values.yaml file, as shown in the sketch after this list. To set the other relevant parameters, for example, the threshold number of messages in the Message Queue, refer to this section.

    Tip
    For a dedicated runtime (Deployment) pod, you need to set the threshold values for Message Queue, CPU, and memory while creating the Deployment. For more details, refer to this page.


  • The autoscaling of other microservices' pods can happen based on the threshold values you set for CPU or memory, or both. You can make these configurations in the global values.yaml file. For more details, refer to this section.
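
For reference, a minimal sketch of this setting in the global values.yaml. The nesting shown under global > config > autoscaling follows the path named above; match it to the structure of your own file.

global:
  config:
    autoscaling:
      type: keda   # default is hpa; set to keda to also autoscale on Message Queue count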

When you use Kubernetes HPA,

  • The autoscaling of runtime pods can happen based only on the threshold values you set for CPU or memory, or both. You can make these configurations in the global values.yaml file. To set the relevant parameters in the values.yaml file, refer to this section.

    Tip
    Ensure that the value for the type variable under the global > config > autoscaling section in the values.yaml file is set to hpa.


    Tip
    For a dedicated runtime (Deployment) pod, you need to set the threshold values for CPU and memory while creating the Deployment. For more details, refer to this page.


  • The autoscaling of the other microservices' pods can happen based on the threshold values you set for CPU or memory, or both. You can make these configurations in the global values.yaml file. To set the relevant parameters in the values.yaml file, refer to this section.

Configuring autoscaling for runtime microservice

The parameters for configuring the runtime microservice for autoscaling differ slightly from those for the rest of the microservices. The following table describes the autoscaling parameters for the runtime microservice. You can find these parameters in the runtimeImage: section in the global values.yaml file.

| Parameter | Description | Default value |
|---|---|---|
| RUNTIME_AUTOSCALING_ENABLED | Parameter to enable autoscaling by setting its value to true. | true |
| RUNTIME_MIN_POD | Minimum number of runtime pods. | 1 |
| RUNTIME_MAX_POD | The maximum number of pods the runtime microservice can scale up to. | 1 |
| RUNTIME_AUTOSCALING_CRITERIA_MESSAGE_COUNT | Variable to define whether you want the autoscaling to happen based on the Message Queue count. Setting this variable to true denotes that the runtime pods autoscale based on the number of messages in queued state in the Message Queue. Note: This variable is applicable only when you use KEDA for autoscaling. | true |
| RUNTIME_AUTOSCALING_CRITERIA_CPU | Variable to define whether you want the autoscaling to happen based on CPU usage. Setting this variable to true denotes that the runtime pods autoscale based on CPU usage. | true |
| RUNTIME_AUTOSCALING_CRITERIA_MEMORY | Variable to define whether you want the autoscaling to happen based on memory usage. Setting this variable to true denotes that the runtime pods autoscale based on memory usage. | false |
| RUNTIME_AUTOSCALING_TARGETCPUUTILIZATIONPERCENTAGE | Value, as a percentage of the CPU requests set in the global values.yaml for the runtime pods, at which a new pod spins up. | 400 |
| RUNTIME_AUTOSCALING_TARGETMEMORYUTILIZATIONPERCENTAGE | Value, as a percentage of the memory requests set in the global values.yaml for the runtime pods, at which a new pod spins up. | 400 |
| RUNTIME_AUTOSCALING_QUEUE_MESSAGE_COUNT | The threshold number of messages in queued state in the Message Queue at which KEDA spins up a new pod. Note: This variable is applicable only when you use KEDA for autoscaling. | 400 |
| RUNTIME_SCALE_UP_STABILIZATION_WINDOW_SECONDS | The duration (in seconds) for which the application keeps a watch on spikes in resource utilization by the currently running pods. This helps in determining whether scaling up is required. | 300 |
| RUNTIME_MAX_POD_TO_SCALE_UP | The maximum number of pods by which the runtime microservice can scale up at a time. | 1 |
| RUNTIME_SCALE_UP_PERIOD_SECONDS | The time duration (in seconds) that sets how frequently spikes in resource utilization by the currently running pods are tracked. | 60 |
| RUNTIME_SCALE_DOWN_STABILIZATION_WINDOW_SECONDS | The duration (in seconds) for which the application keeps a watch for a drop in resource utilization by the currently running pods. This helps in determining whether scaling down is required. | 300 |
| RUNTIME_MAX_POD_TO_SCALE_DOWN | The maximum number of pods by which the runtime microservice can scale down at a time. | 1 |
| RUNTIME_SCALE_DOWN_PERIOD_SECONDS | The time duration (in seconds) that sets how frequently drops in resource utilization by the currently running pods are tracked. | 60 |
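
Putting these parameters together, a minimal sketch of the runtimeImage: section with autoscaling configured. The values shown are the defaults from the table above, except RUNTIME_MAX_POD, which is raised here for illustration so that scaling can actually occur; the exact placement of these variables within runtimeImage: should match your values.yaml.

runtimeImage:
  RUNTIME_AUTOSCALING_ENABLED: true
  RUNTIME_MIN_POD: 1
  RUNTIME_MAX_POD: 3                               # raised from the default of 1 for illustration
  # Scaling criteria; message count applies only when KEDA is enabled
  RUNTIME_AUTOSCALING_CRITERIA_MESSAGE_COUNT: true
  RUNTIME_AUTOSCALING_CRITERIA_CPU: true
  RUNTIME_AUTOSCALING_CRITERIA_MEMORY: false
  # Thresholds at which a new pod spins up
  RUNTIME_AUTOSCALING_QUEUE_MESSAGE_COUNT: 400
  RUNTIME_AUTOSCALING_TARGETCPUUTILIZATIONPERCENTAGE: 400
  RUNTIME_AUTOSCALING_TARGETMEMORYUTILIZATIONPERCENTAGE: 400
  # Scale-up and scale-down behavior
  RUNTIME_SCALE_UP_STABILIZATION_WINDOW_SECONDS: 300
  RUNTIME_MAX_POD_TO_SCALE_UP: 1
  RUNTIME_SCALE_UP_PERIOD_SECONDS: 60
  RUNTIME_SCALE_DOWN_STABILIZATION_WINDOW_SECONDS: 300
  RUNTIME_MAX_POD_TO_SCALE_DOWN: 1
  RUNTIME_SCALE_DOWN_PERIOD_SECONDS: 60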


...

Configuring autoscaling for other microservices (excluding runtime)

To enable HPA, you need to set the parameters as described below for each of the microservices individually. You can find these parameters in the respective section of each microservice in the global values.yaml file.

| Parameter | Description | Default value |
|---|---|---|
| autoscaling > enabled | Parameter to enable autoscaling by setting its value to true. | true |
| autoscaling > criteria > cpu | Variable to define whether you want the autoscaling to happen based on CPU usage. Setting this variable to true denotes that the microservice's pods autoscale based on CPU usage. | true |
| autoscaling > criteria > memory | Variable to define whether you want the autoscaling to happen based on memory usage. Setting this variable to true denotes that the microservice's pods autoscale based on memory usage. | false |
| autoscaling > minReplicas | Minimum number of pods for a microservice. | 1 |
| autoscaling > maxReplicas | The maximum number of pods a microservice can scale up to. | 1 |
| autoscaling > targetCPUUtilizationPercentage | Value, as a percentage of the CPU requests set in the global values.yaml for the microservice's pods, at which the HPA spins up a new pod. | 400 |
| autoscaling > targetMemoryUtilizationPercentage | Value, as a percentage of the memory requests set in the global values.yaml for the microservice's pods, at which the HPA spins up a new pod. | 400 |
| autoscaling > behavior > scaleUp > stabilizationWindowSeconds | The duration (in seconds) for which the application keeps a watch on spikes in resource utilization by the currently running pods. This helps in determining whether scaling up is required. | 300 |
| autoscaling > behavior > scaleUp > maxPodToScaleUp | The maximum number of pods by which a microservice can scale up at a time. | 1 |
| autoscaling > behavior > scaleUp > periodSeconds | The time duration (in seconds) that sets how frequently spikes in resource utilization by the currently running pods are tracked. | 60 |
| autoscaling > behavior > scaleDown > stabilizationWindowSeconds | The duration (in seconds) for which the application keeps a watch for a drop in resource utilization by the currently running pods. This helps in determining whether scaling down is required. | 300 |
| autoscaling > behavior > scaleDown > maxPodToScaleDown | The maximum number of pods by which a microservice can scale down at a time. | 1 |
| autoscaling > behavior > scaleDown > periodSeconds | The time duration (in seconds) that sets how frequently drops in resource utilization by the currently running pods are tracked. | 60 |
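
Similarly, a minimal sketch of the autoscaling: block inside one microservice's section of the global values.yaml. The microservice name is a placeholder for illustration; repeat the block for each microservice you want to autoscale, and note that maxReplicas is raised here from its default of 1 so that scaling can occur.

<microservice-name>:            # placeholder; use the microservice's actual section name
  autoscaling:
    enabled: true
    criteria:
      cpu: true
      memory: false
    minReplicas: 1
    maxReplicas: 3              # raised from the default of 1 for illustration
    targetCPUUtilizationPercentage: 400
    targetMemoryUtilizationPercentage: 400
    behavior:
      scaleUp:
        stabilizationWindowSeconds: 300
        maxPodToScaleUp: 1
        periodSeconds: 60
      scaleDown:
        stabilizationWindowSeconds: 300
        maxPodToScaleDown: 1
        periodSeconds: 60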


Load balancing among the runtime pods 

Kubernetes internally handles the load balancing of requests from a Queue to the runtime pods of the corresponding Deployment. The runtime pods process two types of requests – Synchronous and Asynchronous.

...

When all the runtime pods are fully occupied, the remaining messages in the queue are prioritized and routed to a runtime pod as soon as it frees up and has capacity.

...

Related topics

Creating a Deployment

Deploying KEDA