Database Corruption Recovery
Database corruption can occur in both the SQL database (H2, MySQL, or other) and the NoSQL point value store. This page covers how to identify corruption, recover from it, and prevent future occurrences.
Symptoms
H2 Database Corruption
- Mango fails to start with org.h2.jdbc.JdbcSQLException errors in the log.
- Error code 50100: "The file is locked" or "The database has been closed".
- Error code 90030 or 90117: Indicates corrupt data pages or index inconsistencies.
- Messages like "File corrupted while reading", "Block not found", or "IOException reading from store".
- Mango starts but certain configuration pages are blank or show errors.
- JSON import/export fails with database-level exceptions.
NoSQL Point Value Store Corruption
- Point value queries return no data for specific points or time ranges where data should exist.
- The Mango log shows IOException errors related to NoSQL shard files.
- The NoSQL corruption scan reports damaged shards.
- Point value history charts show unexpected gaps.
- "NoSQL Data Lost" events appear in the alarm system.
Common Causes
H2 Corruption
- Unclean shutdown: Power failure, forced process kill (kill -9), or operating system crash while H2 is writing data.
- Disk space exhaustion: If the disk fills up during a write operation, H2 may leave files in an inconsistent state.
- File system corruption: Underlying storage issues (bad sectors, RAID degradation, NFS errors) can corrupt the database files.
- Concurrent access: Multiple Mango instances or external tools accessing the same H2 database files simultaneously.
- Incompatible H2 version: Using H2 database files created with a different version of the H2 engine.
NoSQL Corruption
- Unclean shutdown while batch write-behind tasks are flushing data to disk.
- Disk I/O errors during shard file writes.
- Manual deletion of NoSQL shard files while Mango is running.
- Storage device failure causing partial writes to shard files.
Diagnosis
H2 Database
Check the log for H2 errors
grep -i "JdbcSQL\|h2.*corrupt\|h2.*IOException\|h2.*error code" MA_HOME/logs/ma.log
Verify database file integrity
# Check that the database files exist and have reasonable sizes
ls -la MA_HOME/databases/*.mv.db
ls -la MA_HOME/databases/*.trace.db
# Check for lock files (should not exist when Mango is stopped)
ls -la MA_HOME/databases/*.lock.db
If a .lock.db file exists while Mango is not running, the previous shutdown was not clean.
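The lock-file check can be wrapped in a small helper for use in maintenance scripts. This is a minimal sketch: the function name find_stale_locks is hypothetical, and it assumes you have already confirmed that Mango is stopped, since it does not look for the process itself.

```shell
# Hypothetical helper: list H2 lock files left in a databases directory.
# Assumes Mango is already stopped; any file printed suggests an unclean shutdown.
find_stale_locks() {
  db_dir="$1"
  # Only look at the top level of the directory, matching the ls check
  find "$db_dir" -maxdepth 1 -type f -name '*.lock.db'
}

# Example (MA_HOME is a placeholder for your install path):
#   find_stale_locks MA_HOME/databases
```

An empty result means no lock files were found.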
Try to open the database with H2's recovery tool
# From the MA_HOME directory, use the H2 JAR to attempt recovery
java -cp lib/h2-*.jar org.h2.tools.Recover -dir databases -db mah2
This generates a SQL script (mah2.h2.sql) that can be used to rebuild the database.
NoSQL Point Value Store
Run a corruption scan
If Mango is running, use the NoSQL module's Corruption Scan feature:
- Navigate to the NoSQL settings page (System Settings or the module's dedicated page).
- Configure the number of corruption scan task threads. More threads speed up the scan but use more CPU and disk I/O.
- Start the corruption scan.
- Review the results for damaged shards.
Check for NoSQL data lost events
Review the Events page for "NoSQL Data Lost" events. These are raised whenever a batch write fails to persist data to disk.
Inspect shard files manually
# NoSQL data is stored in shard files organized by point ID
ls -la MA_HOME/databases/mangoTSDB/
# Check for zero-length files (likely corrupt)
find MA_HOME/databases/mangoTSDB/ -empty -type f
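To track whether the number of suspect files changes over time, the same find check can be reduced to a single count. A minimal sketch; count_empty_shards is a hypothetical name, and a zero-length file is only a hint of corruption, not proof.

```shell
# Hypothetical helper: count zero-length files under a NoSQL data directory.
count_empty_shards() {
  data_dir="$1"
  # tr strips the padding some wc implementations add around the number
  find "$data_dir" -type f -empty | wc -l | tr -d ' '
}

# Example (MA_HOME is a placeholder for your install path):
#   count_empty_shards MA_HOME/databases/mangoTSDB
```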
Solutions
H2 Recovery: Option 1 -- Restore from Backup
The fastest and most reliable recovery method is restoring from a recent backup.
- Stop Mango completely.
- Locate your backup files. By default, Mango stores SQL database backups in MA_HOME/backup/.
- Back up the current (corrupt) database before restoring, in case you need it later:
mkdir MA_HOME/databases/corrupt_backup
cp MA_HOME/databases/mah2.mv.db MA_HOME/databases/corrupt_backup/
- Restore the backup:
cp MA_HOME/backup/latest_backup/mah2.mv.db MA_HOME/databases/
- Start Mango. Configuration changes made after the backup timestamp will be lost, but the system should be functional.
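The copy steps above could be combined into one helper that refuses to run when the backup file is missing or empty. This is a sketch under assumptions, not an official Mango tool: the function name restore_h2_backup and the timestamped corrupt_backup-... directory name are hypothetical, and it must only be run while Mango is stopped.

```shell
# Hypothetical helper: keep a copy of the corrupt H2 file, then copy a backup into place.
# Arguments: <databases dir> <backup .mv.db file>. Run only while Mango is stopped.
restore_h2_backup() {
  db_dir="$1"
  backup_file="$2"
  keep_dir="$db_dir/corrupt_backup-$(date +%Y%m%d-%H%M%S)"

  # Refuse to continue if the backup is missing or empty
  [ -s "$backup_file" ] || { echo "backup not found or empty: $backup_file" >&2; return 1; }

  # Preserve the current (corrupt) database in case it is needed later
  mkdir -p "$keep_dir" || return 1
  if [ -f "$db_dir/mah2.mv.db" ]; then
    cp -p "$db_dir/mah2.mv.db" "$keep_dir/" || return 1
  fi

  # Overwrite the live database file with the backup
  cp "$backup_file" "$db_dir/mah2.mv.db"
}

# Example (paths are placeholders):
#   restore_h2_backup MA_HOME/databases MA_HOME/backup/latest_backup/mah2.mv.db
```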
H2 Recovery: Option 2 -- Use H2 Recovery Tool
If no backup is available, attempt to recover data using H2's built-in recovery tool:
- Stop Mango.
- Run the H2 recovery tool:
java -cp MA_HOME/lib/h2-*.jar org.h2.tools.Recover -dir MA_HOME/databases -db mah2
- This generates mah2.h2.sql containing the recoverable data as SQL statements.
- Rename or remove the corrupt database:
mv MA_HOME/databases/mah2.mv.db MA_HOME/databases/mah2.mv.db.corrupt
- Start Mango. It will create a fresh, empty H2 database.
- Import the recovered SQL if needed using the H2 console or SQL data source.
The H2 recovery tool may not recover all data. Tables or rows that were in the corrupted portion of the file may be lost. Always verify the recovered data before relying on it.
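As a first sanity check before importing, you can count the statements in the recovered script. The sketch below assumes the script contains CREATE TABLE and INSERT INTO statements that begin at the start of a line, which may not hold for every H2 version; summarize_recovery_script is a hypothetical name. A count of zero is a strong sign that little or nothing was recovered, but a non-zero count does not prove the data is complete.

```shell
# Hypothetical helper: count CREATE TABLE and INSERT INTO statements in a recovery script.
# Assumes each statement starts at the beginning of a line.
summarize_recovery_script() {
  script="$1"
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true"
  tables=$(grep -c -i '^CREATE TABLE' "$script" || true)
  inserts=$(grep -c -i '^INSERT INTO' "$script" || true)
  echo "tables=$tables inserts=$inserts"
}

# Example (path is a placeholder):
#   summarize_recovery_script MA_HOME/databases/mah2.h2.sql
```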
H2 Recovery: Option 3 -- Migrate to MySQL
If H2 corruption is a recurring problem, consider migrating to MySQL or MariaDB, which provide better crash recovery through InnoDB's write-ahead logging:
- Start Mango with a fresh H2 database.
- Reconfigure the system using JSON import (if you have a configuration export).
- Change mango.properties to use MySQL (see H2 to MySQL Migration).
NoSQL Recovery: Restore from Backup
The NoSQL module supports both full and incremental backups:
- Navigate to the NoSQL settings page.
- Use the Restore Mango NoSQL Database tool.
- Configure the Source directory (where backup files are stored, default: MA_HOME/backups/).
- Set the Destination directory to the current NoSQL database location to overwrite the current database.
- Select the backup to restore. Incremental backups are listed with -incremental- in the name.
- Enable Incremental Restore to apply incremental backups on top of the selected base backup. Incremental backups are applied in order of their last-modified dates.
- Start the restore.
Restoring a NoSQL backup never removes existing files. It unzips the backup files on top of the existing directory structure. This can add new data and fill gaps, but it can also overwrite existing files with different data.
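Because incremental backups are applied in order of their last-modified dates, it can help to preview that order before starting a restore. A minimal sketch: list_incrementals_oldest_first is a hypothetical name, and the only naming detail taken from this page is the -incremental- fragment; the rest of the file name is not assumed.

```shell
# Hypothetical helper: list incremental backup files, oldest modification time first.
# Matches any file whose name contains "-incremental-".
list_incrementals_oldest_first() {
  backup_dir="$1"
  # -t sorts by modification time, -r reverses so the oldest comes first
  ls -1tr "$backup_dir"/*-incremental-* 2>/dev/null
}

# Example (MA_HOME is a placeholder for your install path):
#   list_incrementals_oldest_first MA_HOME/backups
```

If a file's modification time was changed by copying it without preserving timestamps, the listed order will no longer reflect when the backup was taken.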
NoSQL Recovery: Rebuild from SQL
If the SQL database is intact but the NoSQL store is corrupted:
- The SQL event database and configuration are unaffected by NoSQL corruption.
- Point value history stored in the NoSQL database may be partially or fully lost.
- Going forward, new point values will be stored correctly.
- If you have data in an external system (historian, MQTT broker, etc.), you can re-import it using the migration tools.
Repair Specific NoSQL Shards
For targeted repair of specific corrupted shards:
- Identify the corrupted shard files from the corruption scan results.
- If you have incremental backups that cover the time period of the corrupted shard, restore just those specific backup files.
- If no backup is available, the data in the corrupted shard is likely unrecoverable. Remove the corrupt shard file (while Mango is stopped) to prevent ongoing errors.
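A more cautious variation on removing a corrupt shard is to move it into a quarantine directory, so the file can still be inspected or attached to a bug report. This is a sketch of that variation rather than the documented procedure: quarantine_shard and the quarantine directory are hypothetical, and it must only be run while Mango is stopped.

```shell
# Hypothetical helper: move a corrupt shard file into a quarantine directory
# instead of deleting it. Run only while Mango is stopped.
quarantine_shard() {
  shard="$1"
  quarantine_dir="$2"
  target="$quarantine_dir/$(basename "$shard")"

  [ -f "$shard" ] || { echo "not a file: $shard" >&2; return 1; }
  # Shards from different points may share a file name, so never overwrite
  [ ! -e "$target" ] || { echo "already quarantined: $target" >&2; return 1; }

  mkdir -p "$quarantine_dir" || return 1
  mv "$shard" "$target"
}

# Example (both paths are placeholders):
#   quarantine_shard path/to/corrupt_shard_file path/to/quarantine_dir
```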
Prevention
For H2 Databases
- Enable automatic SQL backups in System Settings. Configure daily backups with sufficient retention (e.g., keep the last 14 backups).
- Use a UPS (Uninterruptible Power Supply) for the Mango server to prevent unclean shutdowns from power failures.
- Always stop Mango gracefully using systemctl stop or the UI shutdown button rather than kill -9.
- Monitor disk space and set up alerts when the disk approaches capacity. H2 corruption from disk-full conditions is difficult to recover from.
- Consider migrating to MySQL/MariaDB for production systems. These databases have significantly better crash recovery mechanisms than H2.
- Never run multiple Mango instances against the same H2 database files.
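The disk space monitoring mentioned above can be approximated with a small check run from cron or another scheduler. A minimal sketch using the portable df -P output format: the function names and the example threshold are hypothetical, and it assumes the file system name contains no spaces.

```shell
# Hypothetical helper: print the usage percentage (digits only) of the
# file system that holds the given path.
disk_usage_percent() {
  df -P "$1" | awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: succeed only while usage is below the given percentage.
disk_below_threshold() {
  path="$1"
  limit="$2"
  used=$(disk_usage_percent "$path")
  [ "$used" -lt "$limit" ]
}

# Example (path and threshold are placeholders):
#   disk_below_threshold MA_HOME/databases 90 || echo "disk nearly full" >&2
```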
For the NoSQL Point Value Store
- Enable automatic NoSQL backups with incremental backup support to minimize backup sizes while maintaining recovery capability.
- Monitor for "NoSQL Data Lost" events and investigate immediately when they occur.
- Ensure adequate disk I/O performance. Slow storage can cause batch write failures that lead to data loss events.
- Adjust NoSQL performance settings if data lost events occur frequently. Increase the max batch write behind tasks and batch sizes to handle higher throughput.
- Run periodic corruption scans (e.g., monthly) to detect issues before they affect data availability.
- Use reliable storage (enterprise SSDs or RAID arrays with battery-backed write cache) for the NoSQL data directory.
Related Pages
- H2 to MySQL Migration — Migrate from H2 to a more robust database to prevent future corruption
- Server Error 500 — Server errors that may be caused by underlying database corruption
- Startup Stuck — Startup failures caused by corrupted database files
- Reporting Bugs — How to report database corruption with proper diagnostic information