Database Corruption Recovery

Database corruption can occur in both the SQL database (H2, MySQL, or other) and the NoSQL point value store. This page covers how to identify corruption, recover from it, and prevent future occurrences.

Symptoms

H2 Database Corruption

  • Mango fails to start with org.h2.jdbc.JdbcSQLException errors in the log.
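  • Error code 90020 (the database may already be in use, often with the underlying message The file is locked) or 90098 (The database has been closed).
  • Error code 90030 (File corrupted while reading): indicates corrupt data pages or index inconsistencies.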
  • Messages like File corrupted while reading, Block not found, or IOException reading from store.
  • Mango starts but certain configuration pages are blank or show errors.
  • JSON import/export fails with database-level exceptions.

NoSQL Point Value Store Corruption

  • Point value queries return no data for specific points or time ranges where data should exist.
  • The Mango log shows IOException errors related to NoSQL shard files.
  • The NoSQL corruption scan reports damaged shards.
  • Point value history charts show unexpected gaps.
  • "NoSQL Data Lost" events appear in the alarm system.

Common Causes

H2 Corruption

  1. Unclean shutdown: Power failure, forced process kill (kill -9), or operating system crash while H2 is writing data.
  2. Disk space exhaustion: If the disk fills up during a write operation, H2 may leave files in an inconsistent state.
  3. File system corruption: Underlying storage issues (bad sectors, RAID degradation, NFS errors) can corrupt the database files.
  4. Concurrent access: Multiple Mango instances or external tools accessing the same H2 database files simultaneously.
  5. Incompatible H2 version: Using H2 database files created with a different version of the H2 engine.

NoSQL Corruption

  1. Unclean shutdown while batch write-behind tasks are flushing data to disk.
  2. Disk I/O errors during shard file writes.
  3. Manual deletion of NoSQL shard files while Mango is running.
  4. Storage device failure causing partial writes to shard files.

Diagnosis

H2 Database

Check the log for H2 errors

grep -i "JdbcSQL\|h2.*corrupt\|h2.*IOException\|h2.*error code" MA_HOME/logs/ma.log
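To see which H2 errors dominate the log, the matches can be tallied by error code. The sketch below assumes H2's usual message suffix of the form [90030-200] (error code, then build number). The function name h2_error_summary is illustrative and not part of Mango.

```shell
# Tally H2 error codes in a log file, most frequent first.
# Assumption: H2 messages end with a "[code-build]" token such as "[90030-200]".
h2_error_summary() {
  local log_file="$1"
  # -o prints only the matched token, one per line; sed keeps the 5-digit code.
  grep -oE '\[[0-9]{5}-[0-9]+\]' "$log_file" \
    | sed -E 's/^\[([0-9]{5})-.*/\1/' \
    | sort | uniq -c | sort -rn
}
```

For example, h2_error_summary MA_HOME/logs/ma.log prints one line per error code with its count, which makes it easier to tell a single lock error from repeated corruption errors.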

Verify database file integrity

# Check that the database files exist and have reasonable sizes
ls -la MA_HOME/databases/*.mv.db
ls -la MA_HOME/databases/*.trace.db

# Check for lock files (should not exist when Mango is stopped)
ls -la MA_HOME/databases/*.lock.db

If a .lock.db file exists while Mango is not running, the previous shutdown was not clean.
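The checks above can be combined into one pass. The following is a minimal sketch, intended to be run only while Mango is stopped; the function name check_h2_files and its output labels are made up for illustration.

```shell
# Report H2 store files and leftover lock files in a database directory.
# Returns non-zero if a store file is missing or empty, or a lock file is present.
check_h2_files() {
  local db_dir="$1" f status=0
  for f in "$db_dir"/*.mv.db; do
    # An unmatched glob stays literal, so test for existence first.
    [ -e "$f" ] || { echo "MISSING: no .mv.db file in $db_dir"; status=1; continue; }
    if [ -s "$f" ]; then
      echo "OK: $f ($(wc -c < "$f" | tr -d ' ') bytes)"
    else
      echo "EMPTY: $f"; status=1
    fi
  done
  for f in "$db_dir"/*.lock.db; do
    [ -e "$f" ] || continue
    # With Mango stopped, a lock file points to an unclean shutdown.
    echo "LOCK: $f"; status=1
  done
  return "$status"
}
```

Calling check_h2_files MA_HOME/databases and checking the exit status gives a quick yes/no answer before attempting any recovery.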

Try to open the database with H2's recovery tool

# From the MA_HOME directory, use the H2 JAR to attempt recovery
java -cp lib/h2-*.jar org.h2.tools.Recover -dir databases -db mah2

This generates a SQL script (mah2.h2.sql) that can be used to rebuild the database.

NoSQL Point Value Store

Run a corruption scan

If Mango is running, use the NoSQL module's Corruption Scan feature:

  1. Navigate to the NoSQL settings page (System Settings or the module's dedicated page).
  2. Configure the corruption scan task threads (more threads = faster scan, but higher resource usage).
  3. Start the corruption scan.
  4. Review the results for damaged shards.

Check for NoSQL data lost events

Review the Events page for "NoSQL Data Lost" events. These are raised whenever a batch write fails to persist data to disk.

Inspect shard files manually

# NoSQL data is stored in shard files organized by point ID
ls -la MA_HOME/databases/mangoTSDB/

# Check for zero-length files (likely corrupt)
find MA_HOME/databases/mangoTSDB/ -empty -type f
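When the find command returns many files, grouping them by directory shows which parts of the store are affected. This is a small sketch; the function name empty_shards_by_dir is illustrative, and it makes no assumption about shard file names.

```shell
# Count zero-length files per directory under the NoSQL data directory,
# listing the directories with the most empty files first.
empty_shards_by_dir() {
  local tsdb_dir="$1"
  find "$tsdb_dir" -type f -empty -exec dirname {} \; \
    | sort | uniq -c | sort -rn
}
```

Running empty_shards_by_dir MA_HOME/databases/mangoTSDB prints a count followed by a directory path on each line. A zero-length file is only a hint of corruption, so the corruption scan results remain the authoritative source.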

Solutions

H2 Recovery: Option 1 -- Restore from Backup

The fastest and most reliable recovery method is restoring from a recent backup.

  1. Stop Mango completely.
  2. Locate your backup files. By default, Mango stores SQL database backups in MA_HOME/backup/.
  3. Back up the current (corrupt) database before restoring, in case you need it later:
    mkdir -p MA_HOME/databases/corrupt_backup
    cp MA_HOME/databases/mah2.mv.db MA_HOME/databases/corrupt_backup/
  4. Restore the backup:
    cp MA_HOME/backup/latest_backup/mah2.mv.db MA_HOME/databases/
  5. Start Mango. Configuration changes made after the backup timestamp will be lost, but the system should be functional.
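Steps 3 and 4 can be wrapped in a small helper so that the corrupt file is always preserved before it is overwritten. This is a sketch under the assumptions used above (the database file is named mah2.mv.db and the backup is a plain copy of that file); restore_h2_backup is a hypothetical name. Run it only while Mango is stopped.

```shell
# Keep a timestamped copy of the current database, then restore a backup over it.
restore_h2_backup() {
  local backup_file="$1" db_dir="$2"
  local db_file="$db_dir/mah2.mv.db"
  local keep_dir="$db_dir/corrupt_backup"

  # Refuse to continue if the backup is missing or empty.
  [ -s "$backup_file" ] || { echo "Backup missing or empty: $backup_file" >&2; return 1; }

  mkdir -p "$keep_dir" || return 1
  if [ -e "$db_file" ]; then
    # The timestamp suffix avoids overwriting an earlier preserved copy.
    cp -p "$db_file" "$keep_dir/mah2.mv.db.$(date +%Y%m%d-%H%M%S)" || return 1
  fi
  cp -p "$backup_file" "$db_file"
}
```

For example: restore_h2_backup MA_HOME/backup/latest_backup/mah2.mv.db MA_HOME/databases.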

H2 Recovery: Option 2 -- Use H2 Recovery Tool

If no backup is available, attempt to recover data using H2's built-in recovery tool:

  1. Stop Mango.
  2. Run the H2 recovery tool:
    java -cp MA_HOME/lib/h2-*.jar org.h2.tools.Recover -dir MA_HOME/databases -db mah2
  3. This generates mah2.h2.sql containing the recoverable data as SQL statements.
  4. Rename or remove the corrupt database:
    mv MA_HOME/databases/mah2.mv.db MA_HOME/databases/mah2.mv.db.corrupt
  5. Start Mango. It will create a fresh, empty H2 database.
  6. If needed, import the recovered SQL script (mah2.h2.sql) using the H2 console or a SQL data source.

Warning: The H2 recovery tool may not recover all data. Tables or rows stored in the corrupted portion of the file may be lost. Always verify the recovered data before relying on it.
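Because the recovery script is the only copy of the recovered data, it is worth confirming that it exists before moving the corrupt database out of the way. The sketch below assumes the Recover tool wrote mah2.h2.sql into the same directory as the database; set_aside_corrupt_db is a hypothetical helper name.

```shell
# Rename the corrupt database only if a non-empty recovery script is present.
set_aside_corrupt_db() {
  local db_dir="$1" db_name="${2:-mah2}"
  local script="$db_dir/$db_name.h2.sql"
  local db_file="$db_dir/$db_name.mv.db"

  [ -s "$script" ] || { echo "No recovery script at $script; leaving database in place" >&2; return 1; }
  [ -e "$db_file" ] || { echo "Database file not found: $db_file" >&2; return 1; }

  # Rename rather than delete, so the original remains available for another attempt.
  mv "$db_file" "$db_file.corrupt"
}
```

For example: set_aside_corrupt_db MA_HOME/databases. A non-empty script does not guarantee complete data, so the warning above still applies.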

H2 Recovery: Option 3 -- Migrate to MySQL

If H2 corruption is a recurring problem, consider migrating to MySQL or MariaDB, which provide better crash recovery through InnoDB's write-ahead logging:

  1. Start Mango with a fresh H2 database.
  2. Reconfigure the system using JSON import (if you have a configuration export).
  3. Change mango.properties to use MySQL (see H2 to MySQL Migration).
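After editing mango.properties, it can help to confirm which database type is actually configured. The sketch below assumes the setting is stored under a key named db.type, which should be verified against your own properties file; configured_db_type is an illustrative name.

```shell
# Print the value of the last uncommented db.type entry in a properties file.
# Assumption: the database type is configured under the key "db.type".
configured_db_type() {
  local props="$1"
  grep -E '^[[:space:]]*db\.type[[:space:]]*=' "$props" \
    | tail -n 1 | cut -d= -f2- | tr -d '[:space:]'
}
```

Pass the path to your mango.properties file as the only argument. An empty result means no uncommented db.type line was found.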

NoSQL Recovery: Restore from Backup

The NoSQL module supports both full and incremental backups:

  1. Navigate to the NoSQL settings page.
  2. Use the Restore Mango NoSQL Database tool.
  3. Configure the Source directory (where backup files are stored, default: MA_HOME/backups/).
  4. Set the Destination directory to the current NoSQL database location to overwrite the current database.
  5. Select the backup to restore. Incremental backups are listed with -incremental- in the name.
  6. Enable Incremental Restore to apply incremental backups on top of the selected base backup. Incremental backups are applied in order of their last-modified dates.
  7. Start the restore.
Note: Restoring a NoSQL backup never removes existing files. It unzips the backup files on top of the existing directory structure. This can add new data and fill gaps, but it can also overwrite existing files with different data.
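Since incremental backups are applied in last-modified order, previewing that order before a restore shows which files will overwrite which. This sketch relies on GNU find (-printf) and on the -incremental- naming described above; incremental_restore_order is a hypothetical helper, not a Mango tool.

```shell
# List incremental backup files in a directory, oldest modification time first.
# Requires GNU find for -printf.
incremental_restore_order() {
  local backup_dir="$1"
  # %T@ is the modification time in seconds since the epoch; %f is the file name.
  find "$backup_dir" -maxdepth 1 -type f -name '*-incremental-*' -printf '%T@ %f\n' \
    | sort -n | cut -d' ' -f2-
}
```

Note that copying backup files between machines without preserving timestamps changes their last-modified dates, which would change the order shown here.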

NoSQL Recovery: Rebuild from SQL

If the SQL database is intact but the NoSQL store is corrupted:

  1. The SQL event database and configuration are unaffected by NoSQL corruption.
  2. Point value history stored in the NoSQL database may be partially or fully lost.
  3. Going forward, new point values will be stored correctly.
  4. If you have data in an external system (historian, MQTT broker, etc.), you can re-import it using the migration tools.

Repair Specific NoSQL Shards

For targeted repair of specific corrupted shards:

  1. Identify the corrupted shard files from the corruption scan results.
  2. If you have incremental backups that cover the time period of the corrupted shard, restore just those specific backup files.
  3. If no backup is available, the data in the corrupted shard is likely unrecoverable. Remove the corrupt shard file (while Mango is stopped) to prevent ongoing errors.

Prevention

For H2 Databases

  • Enable automatic SQL backups in System Settings. Configure daily backups with sufficient retention (e.g., keep the last 14 backups).
  • Use a UPS (Uninterruptible Power Supply) for the Mango server to prevent unclean shutdowns from power failures.
  • Always stop Mango gracefully using systemctl stop or the UI shutdown button rather than kill -9.
  • Monitor disk space and set up alerts when the disk approaches capacity. H2 corruption from disk-full conditions is difficult to recover from.
  • Consider migrating to MySQL/MariaDB for production systems. These databases have significantly better crash recovery mechanisms than H2.
  • Never run multiple Mango instances against the same H2 database files.
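Disk space monitoring can be as simple as a scheduled check against a threshold. The following is a minimal sketch using df; the function names and the 90 percent threshold in the usage note are illustrative choices, not Mango defaults.

```shell
# Print the usage percentage (without the % sign) of the file system holding a path.
disk_usage_percent() {
  # df -P gives one line per file system; column 5 is the capacity, e.g. "42%".
  df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Succeed if usage of the file system holding $1 is at or above $2 percent.
disk_is_above() {
  [ "$(disk_usage_percent "$1")" -ge "$2" ]
}
```

A cron job could run something like disk_is_above MA_HOME/databases 90 && echo "Disk almost full" and route the message to whatever alerting you already use.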

For the NoSQL Point Value Store

  • Enable automatic NoSQL backups with incremental backup support to minimize backup sizes while maintaining recovery capability.
  • Monitor for "NoSQL Data Lost" events and investigate immediately when they occur.
  • Ensure adequate disk I/O performance. Slow storage can cause batch write failures that lead to data loss events.
  • Adjust NoSQL performance settings if data lost events occur frequently. Increase the max batch write behind tasks and batch sizes to handle higher throughput.
  • Run periodic corruption scans (e.g., monthly) to detect issues before they affect data availability.
  • Use reliable storage (enterprise SSDs or RAID arrays with battery-backed write cache) for the NoSQL data directory.

Related Pages

  • H2 to MySQL Migration: Migrate from H2 to a more robust database to prevent future corruption
  • Server Error 500: Server errors that may be caused by underlying database corruption
  • Startup Stuck: Startup failures caused by corrupted database files
  • Reporting Bugs: How to report database corruption with proper diagnostic information