System Monitoring
1. Prerequisites
1.1 Obtain a Downtime Window from the Client
Coordinate with the client to schedule a downtime window for the maintenance.
1.2 Pause the Scheduler
Pause the scheduler to prevent any scheduled tasks from running during maintenance.
1.3 Scale Down the Webrunner Pod
Scale down the webrunner pod to minimize resource usage and prevent conflicts during the health check.
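A minimal sketch of how the scale-down could be scripted, assuming the webrunner runs as a Kubernetes Deployment and that kubectl is already configured for the cluster. The deployment name "webrunner" and the namespace "client-namespace" are placeholders, not values taken from this runbook. The function only builds the command, so it can be reviewed before anyone runs it.

```python
import shlex


def build_scale_command(deployment, namespace, replicas):
    """Return the kubectl argument list that sets a Deployment's replica count."""
    if replicas < 0:
        raise ValueError("replicas must be zero or greater")
    return [
        "kubectl", "scale", f"deployment/{deployment}",
        f"--replicas={replicas}",
        "--namespace", namespace,
    ]


# Placeholder names: substitute the real deployment and namespace.
scale_down = build_scale_command("webrunner", "client-namespace", 0)

# Print the command for review instead of executing it.
print(shlex.join(scale_down))
```

Note the current replica count before scaling down. Passing that count instead of 0 gives the matching scale-up command for section 2.2. To execute a command, pass the list to subprocess.run(..., check=True).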
1.4 Identify Free Space in the Database
1.4.1 SQL Query to Check Free Space
Run the following SQL query to report the total size and the free space of each database:
SELECT table_schema AS "Database Name",
       SUM(data_length + index_length) / 1024 / 1024 AS "Database Size in MB",
       SUM(data_free) / 1024 / 1024 AS "Free Space in MB"
FROM information_schema.TABLES
GROUP BY table_schema;
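Once the query has run, the rows can be ranked so that the databases with the most reclaimable space come first. The sketch below assumes each row arrives as a (name, size in MB, free space in MB) tuple in the column order of the query above. The 100 MB cut-off and the sample rows are illustrative values only, not figures from a real environment.

```python
def rank_by_free_space(rows, min_free_mb=100.0):
    """Return rows with at least min_free_mb of free space, largest first.

    rows: iterable of (schema, size_mb, free_mb) tuples. Values may be
    Decimal or None, depending on the database driver and the schema.
    """
    candidates = []
    for schema, size_mb, free_mb in rows:
        # SUM() yields NULL when a schema has no sized tables, e.g. only views.
        if size_mb is None or free_mb is None:
            continue
        if float(free_mb) >= min_free_mb:
            candidates.append((schema, float(size_mb), float(free_mb)))
    return sorted(candidates, key=lambda row: row[2], reverse=True)


# Made-up sample rows, for illustration only.
sample_rows = [
    ("reports", 900.0, 150.0),
    ("appdb", 2048.0, 512.5),
    ("mysql", 10.2, 4.0),
    ("views_only", None, None),
]

for schema, size_mb, free_mb in rank_by_free_space(sample_rows):
    print(f"{schema}: {free_mb:.1f} MB free of {size_mb:.1f} MB")
```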
1.5 Refer to Log Tables for Cleanup
1.5.1 Cleanup Reference Link
Refer to the "Truncate Log Tables Manually" page for the list of log tables that need to be cleaned up.
1.6 Run the Optimize Query
1.6.1 Optimizing Table Query
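Run the following statement for each table that needs optimizing, replacing tablename with the name of the table. The statement locks the table and rebuilds it as a new table with the same name and data, which is how the unused space is reclaimed, so run it only within the agreed downtime window:
OPTIMIZE TABLE tablename;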
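When several tables need optimizing, the statements can be generated rather than typed by hand. This is a minimal sketch: the schema name "appdb" and the table names are hypothetical, and the real list should come from the cleanup reference in section 1.5. Identifiers are wrapped in backticks, the MySQL quoting style, so that unusual names do not break the statement.

```python
def quote_identifier(name):
    """Quote a MySQL identifier with backticks, doubling any embedded backtick."""
    if not name:
        raise ValueError("identifier must not be empty")
    return "`" + name.replace("`", "``") + "`"


def optimize_statements(schema, tables):
    """Return one OPTIMIZE TABLE statement per table in the given schema."""
    return [
        f"OPTIMIZE TABLE {quote_identifier(schema)}.{quote_identifier(table)};"
        for table in tables
    ]


# Hypothetical schema and table names, for illustration only.
statements = optimize_statements("appdb", ["audit_log", "job_history"])
for statement in statements:
    print(statement)
```

Review the generated statements before running them, since each one locks its table while it runs.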
2. Post Maintenance
2.1 Verify Free Space After Maintenance
Re-run the query from section 1.4.1 and confirm that the "Free Space in MB" value has dropped compared with the value recorded before maintenance.
2.2 Scale Up the Webrunner Pod
Scale up the webrunner pod to restore its original capacity.
2.3 Resume the Scheduler
Resume the scheduler to restart any scheduled tasks.
2.4 Send Confirmation Email to Client
Send a confirmation email to the client informing them that the maintenance has been completed.
3. Monitoring CPU and Memory Utilization Trend
3.1 Log In to the Client Grafana
Log in to the client’s Grafana dashboard.
3.2 Navigate to the Kubernetes-Compute-Resources-Namespace-Pods Dashboard
Navigate to the specified dashboard to monitor CPU and memory utilization.
3.3 Verify the CPU Trend
Review the CPU utilization trend to ensure it is within acceptable limits.
4. Monitoring Disk Utilization Trend
4.1 Log In to the Client Grafana
Log in to the client’s Grafana dashboard.
4.2 Navigate to the Kubernetes-Persistent-Volumes Dashboard
Navigate to the specified dashboard to monitor disk utilization.
4.3 Verify the Disk Utilization Trend
Review the disk utilization trend to ensure it is within acceptable limits.
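One way to turn the trend review into a number is to project when the volume would fill up if growth continues at its current rate. The sketch below is an assumption about how that could be done, not part of the documented procedure. It fits a straight line (ordinary least squares) to daily utilization readings copied from the dashboard and extrapolates from the latest reading. The sample values in the usage comment are made up.

```python
def days_until_full(samples, capacity_pct=100.0):
    """Estimate the days from the last sample until usage reaches capacity_pct.

    samples: list of (day_number, used_percent) pairs in chronological order.
    Returns None when usage is flat or falling, so no fill date can be projected.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are required")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must cover more than one day")
    # Least-squares slope: percentage points of growth per day.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in samples) / sxx
    if slope <= 0:
        return None
    _, last_used = samples[-1]
    return (capacity_pct - last_used) / slope


# Usage with made-up readings (day number, percent used):
# days_until_full([(0, 60.0), (1, 62.0), (2, 64.0), (3, 66.0)])
```

A straight-line fit is only a rough guide. Sudden jumps, such as a large import or a log burst, still need to be read off the dashboard itself.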
5. Log and Archival Cleanup Review
5.1 Log In to the Environment of Each Customer
Log in to the environment of each customer.
5.2 Navigate to the System Console Page
Navigate to the system console page.
5.3 Verify the Cleanup
Verify that log and archival cleanup processes have been executed correctly.
6. Certificate Expiry
6.1 Back Up the cacerts File
Take a backup of the cacerts file in its current location before making any changes.
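A minimal sketch of a timestamped backup, assuming the file is reachable from wherever the script runs. The ".bak_<timestamp>" naming scheme is an illustrative choice, not an established convention of this runbook. The copy is placed next to the original, and the function refuses to overwrite an existing backup.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_file(path, now=None):
    """Copy a file to '<name>.bak_<timestamp>' in the same folder and return the new path."""
    source = Path(path)
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    target = source.with_name(f"{source.name}.bak_{stamp}")
    if target.exists():
        raise FileExistsError(f"backup already exists: {target}")
    # copy2 also preserves the file's modification time.
    shutil.copy2(source, target)
    return target


# Usage, with the keystore path used in section 6.3.1:
# backup_file(r"D:\BOS\certs\16_may_2023\cacerts")
```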
6.2 Place the New Certificate
Place the new certificate in the same location as the old one.
6.3 Add the New Certificate to the cacerts Keystore
6.3.1 Command to Add Certificate
Run the following command to add the new certificate to the cacert:
keytool -import -trustcacerts -alias BOSAPIProdCert -file "D:\BOS\certs\16_may_2023\apim.sfs.operations.dynamics.com_prod.crt" -keystore "D:\BOS\certs\16_may_2023\cacerts"
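After the import, it may be worth confirming that the alias is actually present in the keystore. The sketch below builds the import command shown above together with a keytool -list command for the same alias. The verification step is a suggested addition, not part of the documented procedure. The alias and paths are the ones from the command above. keytool prompts for the keystore password when a command is run.

```python
def keytool_import_args(alias, cert_file, keystore):
    """Argument list for importing a trusted certificate, mirroring the command above."""
    return [
        "keytool", "-import", "-trustcacerts",
        "-alias", alias,
        "-file", cert_file,
        "-keystore", keystore,
    ]


def keytool_list_args(alias, keystore):
    """Argument list for displaying the keystore entry stored under an alias."""
    return ["keytool", "-list", "-alias", alias, "-keystore", keystore]


# Values copied from the command above.
ALIAS = "BOSAPIProdCert"
CERT_FILE = r"D:\BOS\certs\16_may_2023\apim.sfs.operations.dynamics.com_prod.crt"
KEYSTORE = r"D:\BOS\certs\16_may_2023\cacerts"

import_cmd = keytool_import_args(ALIAS, CERT_FILE, KEYSTORE)
verify_cmd = keytool_list_args(ALIAS, KEYSTORE)

# Print both commands for review; run them with subprocess.run(cmd, check=True).
print(import_cmd)
print(verify_cmd)
```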
6.4 Restart the Webrunner and Runtime Microservices
Restart the webrunner and runtime microservices to apply the new certificate.
7. License Expiry
7.1 Notification of License Expiration
Monitor for email notifications regarding license expiration, typically received one month prior.
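Because the notification email can be missed, the expiry date can also be tracked directly. This is a small sketch that assumes the expiry date is known, for example from the previous renewal. The 30-day notice period approximates the one-month lead time mentioned above, and the dates in the usage comment are made up.

```python
from datetime import date


def days_until_expiry(expiry, today=None):
    """Return the number of days from today until the expiry date (negative if past)."""
    today = today or date.today()
    return (expiry - today).days


def needs_renewal(expiry, today=None, notice_days=30):
    """Return True once the license is within notice_days of expiring."""
    return days_until_expiry(expiry, today) <= notice_days


# Usage with made-up dates:
# needs_renewal(date(2024, 6, 30), today=date(2024, 6, 1))
```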
7.2 Obtain Latest License from Gaurav Gautam
Contact Gaurav Gautam to obtain the latest license.
7.3 Place the Latest License in the /shared/license Location
Place the latest license in the specified directory.
7.4 Restart the License Pod
Restart the license pod to apply the new license.
8. Regency Patch Upgrade
8.1 Log In to the Regency Environment via Remote Desktop
Use Remote Desktop to connect to the Regency environment.
8.2 Scale Down the Webrunner and Kernel
Scale down the webrunner and kernel to prepare for updates.
8.3 Navigate to Windows Update
Go to the Windows Update section.
8.4 Click on Update and Restart Windows
Click to update and restart Windows.
8.5 Verify Pending Updates
Check for any additional pending updates after the restart.
8.6 Ensure Windows is Up-to-date
Ensure that all Windows updates are completed.
8.7 Restart the Kernel and Webrunner
Restart the kernel and webrunner to complete the process.