Monitoring Recommendations

The following recommendations help you monitor your Jive environment effectively. To implement them, use a monitoring tool such as check_MK, Zenoss, Zyrion, or IBM/Tivoli, with a polling interval of five minutes.

warning

If you connect Jive to external resources such as an LDAP server, SSO system, SharePoint, or NetApp storage, we recommend monitoring these external and shared resources as well. This is especially important if Jive synchronizes with an LDAP server or authenticates against an SSO system, because it helps you troubleshoot login issues. We have observed outages caused by LDAP server availability problems in our hosted environments.

Monitoring Items

1. All Nodes

What to Monitor:

Memory utilization
CPU load
Disk space
Disk I/O activity
Network traffic
Clock accuracy

Why Monitor: Regular checks on these metrics assist in troubleshooting and ensuring optimal performance:

Memory Utilization: If utilization is consistently near 75%, consider increasing memory.
CPU Load: Healthy nodes typically show a load between 0 and 10. If the load is consistently above 5, consider taking thread dumps with the jive snap command and opening a case with Support.
Disk Space: Ensure sufficient space for search indexes, attachments, images, and binary content caching. The default limit for the binstore cache is 512MB.
Network Traffic: Tracking helps understand traffic patterns and drop-offs.
Clock Accuracy: Ensure clocks are synchronized in clustered environments, preferably using NTP.
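
The word "consistently" in the guidance above suggests alerting on a run of samples rather than on a single reading. The following Python sketch shows one way to express that. The class name, the window of three samples, and the strict "above threshold" comparison are illustrative assumptions, not part of Jive or of any monitoring tool named here.

```python
from collections import deque


class ConsistencyCheck:
    """Report a breach only when every sample in the window exceeds the threshold.

    Hypothetical helper: with five-minute polling, window=3 means the
    metric has been above the threshold for roughly fifteen minutes.
    """

    def __init__(self, threshold, window=3):
        self.threshold = threshold
        self.samples = deque(maxlen=window)  # old samples fall off automatically

    def add(self, value):
        """Record one polled value and return True if the breach is sustained."""
        self.samples.append(value)
        return self.breached()

    def breached(self):
        window_full = len(self.samples) == self.samples.maxlen
        return window_full and all(v > self.threshold for v in self.samples)


# Thresholds taken from the guidance above; tune them for your environment.
memory_check = ConsistencyCheck(threshold=75.0)  # percent of memory in use
cpu_load_check = ConsistencyCheck(threshold=5.0)  # load average
```

Feeding each polled value to `add()` returns `True` only once the whole window is above the threshold, which filters out short spikes.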

2. Jive Web Applications

What to Monitor:

Synthetic health checks using tools such as WebInject, run against each individual web application server and against the load balancer's virtual IP address.

Why Monitor: WebInject verifies the application's functionality:

Request the login page and the homepage to confirm that the service is up and can reach the database.
Run the checks every five minutes initially, then adjust the interval if you see false alarms.

Example Checks:

WebInject Code Example
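
For readers who do not use WebInject, the same kind of synthetic check can be sketched in Python with only the standard library. This is a minimal illustration, not a WebInject test case: the function name, the result dictionary, and the `opener` parameter (which exists only so the function can be exercised without a network) are assumptions made for this example.

```python
import time
import urllib.request


def check_page(url, expected_text, timeout=10, opener=urllib.request.urlopen):
    """Fetch url and report whether it returned HTTP 200 containing expected_text."""
    started = time.monotonic()
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
            body = response.read().decode("utf-8", errors="replace")
    except OSError as exc:
        # urllib's URLError and HTTPError are subclasses of OSError, so this
        # covers refused connections, timeouts, and HTTP error statuses.
        return {"url": url, "ok": False, "detail": str(exc)}
    elapsed = time.monotonic() - started
    ok = status == 200 and expected_text in body
    return {"url": url, "ok": ok, "detail": f"HTTP {status} in {elapsed:.2f}s"}
```

A scheduler could call `check_page` once per web application server and once for the load balancer's virtual IP address, for example with a hypothetical URL such as `https://jive-node1.example.com/login.jspa` and a string that appears on a healthy login page. The URL path and the expected text depend on your installation.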

3. Cache Server

What to Monitor:

JMX hooks (heap)
Disk space (logs)

Why Monitor:

Heap: Monitor for excessive garbage collection; if heap usage is consistently near 75%, increase the heap size. For details, see Adjusting Java Virtual Machine (JVM) settings.
Disk space: Ensure sufficient disk space for logging.

4. Databases (Activity Engine, Analytics, and Web Application)

What to Monitor:

Stats for connections, transactions, longest query time, slow queries
Verify ETLs are running
Disk space
Disk I/O activity

Why Monitor: These checks can indicate potential resource issues:

Connections: Monitor the number of connections so that memory can be sized appropriately.
Transactions: Transaction counts indicate overall traffic volume, though they are secondary to CPU and memory usage.
Queries: Log and monitor slow queries so that they can be optimized.
ETLs: Verify that ETLs are running to ensure data accuracy; check the jivedw_etl_job table.
Disk Space: Keep at least 50% of the disk space on the database server free to avoid complications.
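
The 50% free-space guideline can be checked with a few lines of standard-library Python. This is a sketch only: the function names are hypothetical, and the path you pass should be whichever filesystem holds your database data, which this document does not specify.

```python
import shutil

# Guideline from the list above: keep at least half of the disk free.
MIN_FREE_FRACTION = 0.50


def free_fraction(total_bytes, free_bytes):
    """Return the free share of a filesystem as a value between 0 and 1."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes


def database_disk_ok(path, min_free=MIN_FREE_FRACTION):
    """Return True if the filesystem containing path meets the free-space guideline."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return free_fraction(usage.total, usage.free) >= min_free
```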

5. Document Conversion

What to Monitor:

Tomcat I/O
Heap
Queue statistics (average length and wait times)
Running OpenOffice service statistics
Overall conversion success rate

Why Monitor: Service statistics accessible via JMX help verify conversion processes.

6. Activity Engine

What to Monitor:

Activity Engine service
JMX hooks (heap) and ports
Queue statistics (average length and wait times)

Why Monitor:

Ensure service health via JMX metrics.
Manage memory effectively based on heap usage. For queue details, see Configuring Activity Engine.

Advanced Monitoring Data Points

JMX Data Points

Node	Data Type	JMX Object Name	JMX Attribute Name	Data Point
Jive Web Applications	JVM heap memory	`java.lang:type=Memory`	`HeapMemoryUsage`	`max` / `used`
Cache Server	JVM heap memory	`java.lang:type=Memory`	`HeapMemoryUsage`	`max` / `used`
Activity Engine	JVM heap memory	`java.lang:type=Memory`	`HeapMemoryUsage`	`max` / `used`
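
The `used` and `max` values of `HeapMemoryUsage` are most useful as a ratio, which can then be compared against the 75% heap guidance given for the Cache Server. The Python sketch below assumes the attribute has already been read through whatever JMX bridge your monitoring tool provides and is available as a dictionary; how it is fetched is outside the scope of this example, and the function name is hypothetical.

```python
def heap_percent(heap_memory_usage):
    """Return heap utilization as a percentage, or None if max is undefined.

    heap_memory_usage is assumed to be a mapping with the "used" and "max"
    keys of the JMX HeapMemoryUsage attribute, both in bytes. The JVM
    reports max as -1 when no maximum is defined.
    """
    used = heap_memory_usage["used"]
    maximum = heap_memory_usage["max"]
    if maximum <= 0:
        return None
    return 100.0 * used / maximum
```

For example, a reading of `{"used": 805306368, "max": 1073741824}` (768 MB of a 1 GB heap) corresponds to 75% utilization.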

PostgreSQL Data Points

Collect PostgreSQL data points for the core application and Activity Engine databases, and optionally for the Analytics database. The `poll_postgres.py` script makes one query to the database, and that query returns all data points at once.

Query Method	Type	Data Points
`poll_postgres.py` script	Connections	`Total`, `Active`, `Idle`
`poll_postgres.py` script	Locks	`Total`, `Granted`, `Waiting`, `Exclusive`, `Access Exclusive`
`poll_postgres.py` script	Latencies	`Connection latency`, `SELECT Query latency`
`poll_postgres.py` script	Tuple Rates	`Returned`, `Fetched`, `Inserted`, `Updated`, `Deleted`
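
The contents of `poll_postgres.py` are not reproduced here. As an illustration of how a single query could return most of the data points listed above, the following Python sketch reads PostgreSQL's standard statistics views (`pg_stat_activity`, `pg_locks`, and `pg_stat_database`). It is not the actual script. It assumes a DB-API cursor from a driver of your choice and a PostgreSQL version whose `pg_stat_activity` view has a `state` column (9.2 or later), and the column aliases and function names are made up for this example. Connection latency is not covered, because it has to be measured around the driver's connect call.

```python
import time

# One statement that gathers connection, lock, and tuple counters at once.
STATS_QUERY = """
SELECT
  (SELECT count(*) FROM pg_stat_activity) AS connections_total,
  (SELECT count(*) FROM pg_stat_activity WHERE state = 'active') AS connections_active,
  (SELECT count(*) FROM pg_stat_activity WHERE state = 'idle') AS connections_idle,
  (SELECT count(*) FROM pg_locks) AS locks_total,
  (SELECT count(*) FROM pg_locks WHERE granted) AS locks_granted,
  (SELECT count(*) FROM pg_locks WHERE NOT granted) AS locks_waiting,
  (SELECT count(*) FROM pg_locks WHERE mode = 'ExclusiveLock') AS locks_exclusive,
  (SELECT count(*) FROM pg_locks WHERE mode = 'AccessExclusiveLock') AS locks_access_exclusive,
  d.tup_returned, d.tup_fetched, d.tup_inserted, d.tup_updated, d.tup_deleted
FROM pg_stat_database d
WHERE d.datname = current_database()
"""


def poll(cursor):
    """Run STATS_QUERY on a DB-API cursor and return the row as a dict."""
    started = time.monotonic()
    cursor.execute(STATS_QUERY)
    row = cursor.fetchone()
    latency = time.monotonic() - started
    names = [column[0] for column in cursor.description]
    stats = dict(zip(names, row))
    stats["select_latency_seconds"] = latency  # time for this one SELECT
    return stats


def tuple_rate(previous, current, interval_seconds):
    """Convert two readings of a cumulative tup_* counter into a per-second rate."""
    return (current - previous) / interval_seconds
```

The `tup_*` columns in `pg_stat_database` are cumulative counters, so a rate is the difference between two polls divided by the polling interval, for example `tuple_rate(old["tup_fetched"], new["tup_fetched"], 300)` with five-minute polling.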
