Horizontal Pod Autoscaling

Horizontal Pod Autoscaling (HPA) governs the spinning up of additional pods when the existing resources (CPU and memory) of a microservice are exhausted, and the deletion of those additional pods as and when the resources for the microservice are freed or restored. In Adeptia Connect, autoscaling is enabled by default. You can configure HPA in Adeptia Connect by setting the required parameters in the global values.yaml file.
To enable HPA, set the parameters described below for each of the microservices individually.
The following screenshot illustrates the autoscaling parameters for the webrunner microservice.
...
You can find these parameters in each microservice's section of the global values.yaml file.
...
Parameter | Description | Default value
---|---|---
autoscaling: | |
enabled: | Parameter to enable HPA by setting its value to true. | true
type: | Parameter that defines whether autoscaling happens based on CPU, memory, or both. The possible values are cpu, memory, and cpu-memory. | cpu
minReplicas: | The minimum number of pods for a microservice. | 1
maxReplicas: | The maximum number of pods a microservice can scale up to. | 1
targetCPUUtilizationPercentage: | The value, as a percentage of the CPU requests set in the global values.yaml for the pods, at which the HPA spins up a new pod. | 400
targetMemoryUtilizationPercentage: | The value, as a percentage of the memory requests set in the global values.yaml for the pods, at which the HPA spins up a new pod. | 400
behavior: | |
scaleUp: | |
stabilizationWindowSeconds: | The duration (in seconds) for which the application watches for spikes in resource utilization by the currently running pods. This helps determine whether scaling up is required. | 300
maxPodToScaleUp: | The maximum number of pods a microservice can scale up by at a time. | 1
periodSeconds: | The time duration (in seconds) that sets the frequency of tracking spikes in resource utilization by the currently running pods. | 60
scaleDown: | |
stabilizationWindowSeconds: | The duration (in seconds) for which the application watches for drops in resource utilization by the currently running pods. This helps determine whether scaling down is required. | 300
maxPodToScaleDown: | The maximum number of pods a microservice can scale down by at a time. | 1
periodSeconds: | The time duration (in seconds) that sets the frequency of tracking drops in resource utilization by the currently running pods. | 60
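For example, the autoscaling block for the webrunner microservice might look like the following in the global values.yaml file. This is a minimal sketch based on the parameters in the table above; the exact indentation and surrounding keys depend on your chart version, and the replica and threshold values are illustrative only.

```yaml
# Illustrative autoscaling settings for the webrunner microservice.
autoscaling:
  enabled: true                           # turn HPA on for this microservice
  type: cpu-memory                        # scale on cpu, memory, or cpu-memory
  minReplicas: 1                          # never run fewer pods than this
  maxReplicas: 4                          # never scale beyond this many pods
  targetCPUUtilizationPercentage: 400     # % of CPU requests that triggers a scale-up
  targetMemoryUtilizationPercentage: 400  # % of memory requests that triggers a scale-up
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 300     # watch window for utilization spikes
      maxPodToScaleUp: 1                  # pods added per scaling step
      periodSeconds: 60                   # how often spikes are evaluated
    scaleDown:
      stabilizationWindowSeconds: 300     # watch window for utilization drops
      maxPodToScaleDown: 1                # pods removed per scaling step
      periodSeconds: 60                   # how often drops are evaluated
```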
Configuring HPA for runtime microservice

The HPA configuration for the runtime microservice differs from that of the rest of the microservices. Like the other microservices, the runtime pods are scaled up or down based on CPU and memory utilization. In addition, a runtime pod spins up or gets deleted based on the number of Process Flows in the queued state: when the number of queued Process Flows exceeds the threshold value, a new runtime pod is created. The configuration steps depend on whether the Process Flows run on a shared or a dedicated queue. The sections below describe how runtime HPA is configured in each scenario.
When the Process Flows run on the shared queue
If you are configuring HPA before deploying the application, set the parameters in the global values.yaml file as described below.
...
The following table describes the autoscaling parameters for the runtime microservice. You can find these parameters in the runtimeImage: section of the global values.yaml file.
...
Parameter | Description | Default value
---|---|---
RUNTIME_AUTOSCALING_ENABLED: | Parameter to enable HPA by setting its value to true. | true
RUNTIME_MIN_POD: | The minimum number of pods for the runtime microservice. | 1
RUNTIME_MAX_POD: | The maximum number of pods the runtime microservice can scale up to. | 1
RUNTIME_AUTOSCALING_TYPE: | Parameter that defines whether autoscaling happens based on CPU, memory, or both. The possible values are cpu, memory, and cpu-memory. | cpu
RUNTIME_AUTOSCALING_TARGETCPUUTILIZATIONPERCENTAGE: | The value, as a percentage of the CPU requests set in the global values.yaml for the runtime pods, at which the HPA spins up a new pod. | 400
RUNTIME_AUTOSCALING_TARGETMEMORYUTILIZATIONPERCENTAGE: | The value, as a percentage of the memory requests set in the global values.yaml for the runtime pods, at which the HPA spins up a new pod. | 400
RUNTIME_SCALE_UP_STABILIZATION_WINDOW_SECONDS: | The duration (in seconds) for which the application watches for spikes in resource utilization by the currently running pods. This helps determine whether scaling up is required. | 300
RUNTIME_MAX_POD_TO_SCALE_UP: | The maximum number of pods the runtime microservice can scale up by at a time. | 1
RUNTIME_SCALE_UP_PERIOD_SECONDS: | The time duration (in seconds) that sets the frequency of tracking spikes in resource utilization by the currently running pods. | 60
RUNTIME_SCALE_DOWN_STABILIZATION_WINDOW_SECONDS: | The duration (in seconds) for which the application watches for drops in resource utilization by the currently running pods. This helps determine whether scaling down is required. | 300
RUNTIME_MAX_POD_TO_SCALE_DOWN: | The maximum number of pods the runtime microservice can scale down by at a time. | 1
RUNTIME_SCALE_DOWN_PERIOD_SECONDS: | The time duration (in seconds) that sets the frequency of tracking drops in resource utilization by the currently running pods. | 60
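As an illustration, the corresponding entries might look like the following. This sketch assumes the parameters sit directly under the runtimeImage: key as the table suggests; verify the exact structure in your own global values.yaml, and treat the replica counts as examples only.

```yaml
# Illustrative runtime autoscaling settings under the runtimeImage: section.
runtimeImage:
  RUNTIME_AUTOSCALING_ENABLED: true     # turn HPA on for the runtime microservice
  RUNTIME_AUTOSCALING_TYPE: cpu         # cpu, memory, or cpu-memory
  RUNTIME_MIN_POD: 1                    # minimum runtime pods
  RUNTIME_MAX_POD: 3                    # maximum runtime pods
  RUNTIME_AUTOSCALING_TARGETCPUUTILIZATIONPERCENTAGE: 400
  RUNTIME_AUTOSCALING_TARGETMEMORYUTILIZATIONPERCENTAGE: 400
  RUNTIME_SCALE_UP_STABILIZATION_WINDOW_SECONDS: 300
  RUNTIME_MAX_POD_TO_SCALE_UP: 1
  RUNTIME_SCALE_UP_PERIOD_SECONDS: 60
  RUNTIME_SCALE_DOWN_STABILIZATION_WINDOW_SECONDS: 300
  RUNTIME_MAX_POD_TO_SCALE_DOWN: 1
  RUNTIME_SCALE_DOWN_PERIOD_SECONDS: 60
```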
...
If you want to configure HPA after deploying the application, follow the steps given below.
- Go to the /shared folder in the PVC.
- Open the AUTOSCALING file in edit mode.
- Update the file as shown in the example below.
```
2|3|7|My_Namespace|shared_queue|deployment_1
```
Where,
- 2 - The minimum number of pods for the runtime microservice.
- 3 - The maximum number of pods the runtime microservice can scale up to.
- 7 - The number of Process Flows in the shared queue beyond which the autoscaler spins up a new runtime pod.
- My_Namespace - The name of the namespace.
- shared_queue - The name of the shared queue.
- deployment_1 - The name of the deployment.
- Save the file.
- Restart the runtime microservice.
Note: The changes are reflected within 30 seconds.
When the Process Flows run on a dedicated queue
If your Process Flows are running on a dedicated queue, you can configure HPA by following the steps given on the page "Creating a queue".
...
Load balancing among the runtime pods
Kubernetes internally handles the load balancing of requests from a Queue to the runtime pods of the corresponding Deployment. The runtime pods process two types of requests: Synchronous and Asynchronous.
Synchronous requests are processed by a random runtime pod that the Kubernetes Service selects when it runs in its default iptables proxy mode.
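For context, the Service in front of the runtime pods behaves like any ClusterIP Service: in kube-proxy's default iptables mode, each connection is routed to a randomly chosen backend pod. The manifest below is a generic Kubernetes example with hypothetical names and ports, not Adeptia's actual Service definition.

```yaml
# Generic ClusterIP Service; names and ports are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: runtime                 # hypothetical Service name
spec:
  type: ClusterIP               # default type; iptables mode picks a backend at random
  selector:
    app: runtime                # must match the runtime Deployment's pod labels
  ports:
    - port: 8080                # port exposed by the Service
      targetPort: 8080          # container port on the runtime pods
```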
The Asynchronous requests are processed based on the concurrency level you set for the runtime pods of the Deployment. For example, if there are three (3) runtime pods (each having a concurrency of 5) and eight (8) messages in the Queue, here is how they will be routed:
- The first runtime pod will take up five (5) of the eight (8) messages.
- The second runtime pod will take the remaining three (3) messages.
- The third runtime pod will remain unoccupied until there are more than ten (10) messages at a time.
When all three runtime pods are fully occupied, the remaining messages wait in the queue in priority order and are routed to a runtime pod as soon as it gets free and has a vacancy.