Performance Issues
This section covers performance-related problems in Mango, including high memory consumption, CPU bottlenecks, work item queue backlogs, and slow page load times. Performance issues typically develop gradually and can be diagnosed using Mango's built-in monitoring tools.
Common Performance Issues
- Out of Memory Errors -- Diagnose and resolve Java OutOfMemoryError in Mango, including heap space tuning, identifying memory-intensive configurations, and recognizing memory leak symptoms.
- High CPU Usage -- Diagnose and resolve high CPU usage in Mango caused by excessive polling, thread pool exhaustion, garbage collection pressure, and inefficient dashboards.
- NoSQL Task Queue Full -- Diagnose and resolve NoSQL task queue full errors and data lost events caused by disk I/O bottlenecks, too many data points, or misconfigured flush settings.
Symptoms of Performance Problems
- Mango web pages load slowly or time out.
- The system status page shows high work item queue counts.
- The ma.log file contains OutOfMemoryError entries.
- CPU usage on the Mango server is consistently high.
- Data source polling falls behind schedule, causing stale or missing data.
- The purge process takes excessively long or causes system instability.
Diagnosing Performance Issues
Check System Status
Navigate to Administration > System Status in the Mango UI. This page provides real-time metrics including:
- Work item queues: Shows the number of pending work items for high, medium, and low priority queues. Large backlogs indicate the system cannot keep up with the configured workload.
- Thread pools: Shows active and idle threads. If the pool is fully utilized, new tasks must wait.
- JVM memory: Shows current heap usage and maximum heap size. If usage consistently approaches the maximum, the JVM needs more memory.
- Database metrics: Shows connection pool usage and query performance (if db.useMetrics=true is enabled).
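The JVM memory reading on this page can be cross-checked with a quick calculation. The figures below are placeholders standing in for the used and maximum heap values shown on System Status:

```shell
# Placeholder figures; substitute the heap values (in bytes) from the System Status page.
USED=1610612736   # used heap, ~1.5 GiB
MAX=2147483648    # maximum heap, 2 GiB
awk -v u="$USED" -v m="$MAX" 'BEGIN { printf "heap used: %.0f%%\n", 100*u/m }'
# prints: heap used: 75%
```

Usage that sits above roughly 90% of the maximum for long stretches suggests an undersized heap.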
Check the Log for Warnings
Search ma.log for performance-related messages:
WARN - Work item queue size ... exceeded threshold
WARN - Failed to poll monitor
ERROR - java.lang.OutOfMemoryError: Java heap space
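These messages can be counted in bulk with grep. The sketch below runs against a throwaway sample log so it is self-contained; in practice, LOG would point at your installation's ma.log:

```shell
# Self-contained demo: build a small sample log, then count performance-related entries.
# In practice, set LOG to your real log, e.g. LOG=/opt/mango/logs/ma.log
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
WARN  - Work item queue size 1200 exceeded threshold
ERROR - java.lang.OutOfMemoryError: Java heap space
INFO  - Data source started
EOF
grep -cE 'OutOfMemoryError|Work item queue size|Failed to poll' "$LOG"   # prints 2
```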
Monitor System Resources
Use operating system tools to check resource utilization:
# CPU and memory overview (Linux)
top -p $(pgrep -f mango)
# Disk I/O
iostat -x 1
# Disk space
df -h
Common Performance Problems and Solutions
Out of Memory (OOM)
Cause: The JVM heap is too small for the workload, or a memory leak is consuming available heap.
Diagnosis: Check ma.log for OutOfMemoryError. Check the System Status page for JVM memory usage trends.
Solutions:
- Increase the JVM heap size:
# In env.properties or wrapper configuration
wrapper.java.maxmemory=2048
- Reduce the number of data points with in-memory caching. The Default cache size setting on each data point controls how many recent values are held in memory.
- Reduce the number of simultaneously enabled data sources.
- If using the "All data" logging type on many points, switch to "On change" or "Interval" logging to reduce the volume of data being processed.
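When the cause of an OutOfMemoryError is not obvious, the JVM can write a heap dump at the moment of failure for offline analysis. These are standard HotSpot options, not Mango-specific settings; the dump path is an example and must be writable by the Mango process:

```
# Example JVM options (added wherever your installation sets Java flags)
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/opt/mango/logs
```

The resulting .hprof file can be opened in a heap analysis tool such as Eclipse MAT to see which objects dominate the heap.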
High CPU Usage
Cause: Too many data sources polling at high frequency, a runaway script in a Scripting or Meta data source, or the nightly purge process consuming excessive resources.
Diagnosis: Use OS-level tools to confirm the Mango process is the CPU consumer. Check ma.log for clues about which component is active. Review the System Status work item queues for backlogs.
Solutions:
- Reduce polling frequency on data sources that do not require real-time updates.
- Review Meta and Scripting data sources for inefficient scripts. A script that performs heavy computation on every poll can consume significant CPU.
- If the purge process causes high CPU, consider adjusting purge settings to delete smaller amounts of data more frequently rather than a large purge once per day.
- Disable the OSHI operating system monitoring if it is causing errors on your hardware:
internal.monitor.enableOperatingSystemInfo=false
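To pin high CPU usage to a specific thread inside the JVM, a common technique is to match top's per-thread view against a jstack dump. top reports thread ids in decimal, while jstack reports them in hex as nid; the thread id below is an example value:

```shell
# 1. Per-thread CPU view of the Mango process (run interactively on Linux):
#      top -H -p $(pgrep -f mango)
# 2. Convert the busy thread's id from decimal to the hex form jstack uses:
TID=12345                      # example id taken from top's output
NID=$(printf '0x%x' "$TID")
echo "$NID"                    # prints 0x3039
# 3. Find that thread's stack in a dump:
#      jstack $(pgrep -f mango) | grep -A 20 "nid=$NID"
```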
Work Item Queue Backlogs
Cause: The system cannot process work items as fast as they are generated. This is typically caused by too many data sources, polling too frequently, or insufficient thread pool capacity.
Diagnosis: Check the work item queue counts on the System Status page. If the high-priority queue is consistently growing, the system is overloaded.
Solutions:
- Increase the thread pool sizes:
runtime.realTimeTimer.defaultTaskQueueSize=5000 - Reduce the number of active data sources or lower their polling frequencies.
- Ensure the server hardware has sufficient CPU cores. Each polling data source benefits from available processing threads.
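A quick way to judge whether the host has enough cores for the workload is to compare the load average against the core count (Linux):

```shell
# Sustained 1-minute load above the core count means runnable work is queuing at the OS level.
cores=$(nproc)
load=$(cut -d' ' -f1 /proc/loadavg)
echo "cores=$cores load=$load"
```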
Slow Database Queries
Cause: Large tables without proper indexes, or the database has grown very large due to insufficient purging.
Diagnosis: Enable database query metrics:
db.useMetrics=true
db.metricsThreshold=100
This logs any query that takes longer than 100 milliseconds.
Solutions:
- Review and tighten purge settings to keep the database at a manageable size.
- Run a manual purge from Administration > System Settings > Purge to immediately reduce database size.
- For MySQL/MariaDB, run OPTIMIZE TABLE on large tables.
- Consider migrating from H2 to MySQL/MariaDB for better performance with large datasets.
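The table maintenance can be scripted. The table names here are illustrative and should be checked against your actual schema before piping the output to the mysql client:

```shell
# Generate OPTIMIZE statements for a list of large tables (names are examples).
TABLES="pointValues events"
for t in $TABLES; do
  printf 'OPTIMIZE TABLE %s;\n' "$t"
done
# Then pipe the output into: mysql -u <user> -p <database>
```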
Purge Process Causes Instability
Cause: The nightly purge (runs at 3:05 AM) deletes a very large number of records at once, causing temporary CPU spikes and potential system instability.
Diagnosis: Check ma.log for purge timing and record counts. If the purge deletes millions of records, it will consume significant resources.
Solutions:
- Run manual purges during maintenance windows to reduce the amount the nightly purge must process.
- Tighten purge settings so less data accumulates between purge cycles.
- For points that generate very high volumes of data, use "Interval" or "On change" logging instead of "All data" to reduce the total number of stored values.
Performance Tuning Best Practices
- Right-size the JVM heap: Allocate enough memory for your workload, but not so much that garbage collection pauses become long. A good starting point is 1-2 GB for small installations and 4-8 GB for large ones.
- Use appropriate logging types: Choose "On change" or "Interval" logging instead of "All data" unless every single poll result is needed.
- Set realistic purge periods: Keep only the data you need. Shorter purge periods mean smaller databases and faster queries.
- Monitor regularly: Check the System Status page periodically and set up internal point monitoring for key metrics (JVM memory, queue sizes, CPU load).
- Use an external database for large installations: H2 is convenient but MySQL/MariaDB scales better for deployments with many data points or long retention periods.