
Figure 1: Clustering Architecture
 

Components

Clustering provides high availability, scalability, and manageability of resources and applications by grouping multiple servers running Adeptia Suite. Several components make this possible. The important components of the clustering service include the following:

Kernel

The Kernel is the runtime environment of Adeptia Suite. It handles all execution jobs, for example, the execution of process flows, scheduling of events, queue processing, and process flow recovery. When you enable clustering, the execution of process flows is distributed among the Kernels of each instance of the Adeptia Suite server. All execution requests first go to the Primary Node, which then distributes the execution jobs among all other Kernels. The Primary Node also manages the scheduling of events and queue processing. If any node goes down, the Primary Node identifies the process flows that were running on that node, recovers them, and redistributes their execution among the available nodes.
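
The exact mechanism is internal to Adeptia Suite, but the following minimal Python sketch (all names hypothetical, not Adeptia's actual implementation) illustrates the idea of a Primary Node handing out process flow executions round-robin and re-assigning the flows of a failed node:

    from collections import defaultdict
    import itertools

    class PrimaryNode:
        """Conceptual sketch of a Primary Node distributing process flow executions."""

        def __init__(self, kernel_nodes):
            self.kernel_nodes = list(kernel_nodes)           # e.g. ["node1", "node2", "node3"]
            self._ring = itertools.cycle(self.kernel_nodes)  # simple round-robin distribution
            self.assignments = defaultdict(list)             # node -> process flows running on it

        def submit(self, process_flow_id):
            """All execution requests arrive here first and are handed to a Kernel."""
            node = next(self._ring)
            self.assignments[node].append(process_flow_id)
            return node

        def handle_node_failure(self, failed_node):
            """Recover the flows that were running on the failed node and
            redistribute them among the remaining Kernels."""
            orphaned = self.assignments.pop(failed_node, [])
            self.kernel_nodes.remove(failed_node)
            self._ring = itertools.cycle(self.kernel_nodes)
            for flow in orphaned:
                self.submit(flow)                            # re-run on a surviving node
            return orphaned

    # Usage (hypothetical node names and process flow IDs)
    primary = PrimaryNode(["node1", "node2", "node3"])
    for flow in ["PF-001", "PF-002", "PF-003", "PF-004"]:
        primary.submit(flow)
    recovered = primary.handle_node_failure("node2")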

WebRunner

WebRunner handles all user requests, such as creating, editing, and deleting activities through the GUI. When you enable clustering in Adeptia Suite, it is enabled only for the Kernel, not for WebRunner. If you also want to load balance GUI requests, you can use an external load balancer for the WebRunners (see Figure 1). Make sure that WebRunner runs on all the nodes. Secure Bridge and Secure Engine users must always use an external load balancer.
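
For illustration only, the following Python sketch shows the kind of round-robin selection with a basic health check that an external load balancer performs across WebRunner nodes; the node URLs are hypothetical placeholders, and in practice you would use a dedicated load balancer as shown in Figure 1:

    import itertools
    import urllib.error
    import urllib.request

    WEBRUNNER_NODES = [
        "http://node1.example.com:8080",   # hypothetical WebRunner on node 1
        "http://node2.example.com:8080",   # hypothetical WebRunner on node 2
    ]

    _ring = itertools.cycle(WEBRUNNER_NODES)

    def is_healthy(base_url, timeout=3.0):
        """Return True if the node responds over HTTP (any status code)."""
        try:
            with urllib.request.urlopen(base_url, timeout=timeout):
                return True
        except urllib.error.HTTPError:
            return True        # the node answered, even if with an error status
        except Exception:
            return False

    def next_node():
        """Pick the next healthy WebRunner node in round-robin order."""
        for _ in range(len(WEBRUNNER_NODES)):
            candidate = next(_ring)
            if is_healthy(candidate):
                return candidate
        raise RuntimeError("No WebRunner node is reachable")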

Shared Location for all the Nodes

To enable clustering, it is important that you set up a shared location that can be accessed by all the nodes of the cluster. NFS (Network File System) is designed for sharing files and folders between Linux/Unix systems. It allows you to mount file systems over a network so that remote hosts can interact with them as if they were mounted locally on the same system. With the help of NFS, you can set up file sharing for all of the following folders and databases. For details, refer to the How to Setup NFS (Network File System) section.
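
As a quick sanity check, you can run a small script such as the following hypothetical Python sketch on each node to confirm that the shared (NFS-mounted) folders are visible and writable; the paths are examples only and should be replaced with the locations you actually share:

    import os
    import tempfile

    SHARED_FOLDERS = [
        "/mnt/adeptia_shared/Repository",   # hypothetical repository folder
        "/mnt/adeptia_shared/Recovery",     # hypothetical recovery folder
        "/mnt/adeptia_shared/Rerun",        # hypothetical rerun folder
    ]

    def check_shared_folder(path):
        """Fail loudly if the shared folder is missing or not writable from this node."""
        if not os.path.isdir(path):
            raise RuntimeError(f"{path} is not mounted or does not exist on this node")
        # Verify this node can actually write to the share
        with tempfile.NamedTemporaryFile(dir=path, prefix="cluster_check_", delete=True):
            pass
        print(f"OK: {path} is mounted and writable")

    for folder in SHARED_FOLDERS:
        check_shared_folder(folder)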

Back-end Database

A back-end database is used to store the objects. All the objects (i.e., process flows, activities, users, etc.) created through the GUI are stored in the back-end database. By default, Adeptia Suite uses HSQLDB (an embedded database) as the back-end.
While setting up a clustered environment:

    • All the nodes of the cluster should use the same back-end database (a reachability sketch is given below).
    • Adeptia Suite does not provide clustering or failover setup for the databases; however, you can set that up according to the database you use. For load sharing purposes, it is recommended to configure master/slave or replication (refer to the related database documentation).
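
As an illustration, a script such as the following hypothetical Python sketch can be run on each node to confirm that it can reach the same shared back-end database; the host and port values are assumptions, and only network reachability is verified, not credentials or schema. The same check applies to the log database described below.

    import socket

    BACKEND_DB_HOST = "db.example.com"   # hypothetical shared database host
    BACKEND_DB_PORT = 3306               # hypothetical port (e.g., MySQL)

    def backend_reachable(host, port, timeout=5.0):
        """Return True if a TCP connection to the database server can be opened."""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    if backend_reachable(BACKEND_DB_HOST, BACKEND_DB_PORT):
        print("This node can reach the shared back-end database")
    else:
        print("Shared back-end database is NOT reachable from this node")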

Log Database

Adeptia Suite maintains logs of all the design-time and run-time activities that you run within Adeptia Suite, for example, the process flow log, event log, etc. Adeptia Suite writes all these logs into the log database. All the nodes of the cluster should use the same log database. In addition, for load sharing purposes, it is recommended to configure master/slave or replication (refer to the related database documentation).

Repository Folder

When a process flow is executed, data from the source is converted to an intermediate form and then dispatched to the target. The intermediate data is stored in a repository folder. This should be a shared folder on the network that can be accessed by all the nodes of the cluster. No username/password should be required to connect to this folder.
 

Recovery Folder

During the execution of a process flow, its current state is stored in a recovery file. These recovery files are stored in a recovery folder. Whenever a process flow aborts due to a Kernel shutdown, the Recovery feature handles it automatically with the help of the recovery files. These files remain in the recovery folder until the process flow execution is completed. This folder should be shared among all the nodes of the cluster.

Rerun Folder

Process flows may also abort for other reasons, e.g., an incorrect data mapping or schema definition. During execution, at every checkpoint, the process flow stores its current state in a rerun file. With the help of these files, you can rerun the process flow. These files are stored in a rerun folder, which should also be shared among all the nodes of the cluster.
