
PostgreSQL Data Recovery

Direct answer

PostgreSQL clusters typically fail because of a corrupted data directory (PGDATA), lost WAL (Write-Ahead Log) segments, a damaged base/<dbid>/ directory, a corrupted pg_wal directory (pg_xlog before version 10), a failed major-version upgrade, or a mistakenly forced pg_resetwal. HD Doctor recovers 88% of PostgreSQL cases using a pg parser, control file reconstruction and relation extraction, backed by 24+ years of data recovery experience and 180+ PostgreSQL cases.

Critical: do NOT run pg_resetwal on a production cluster without a backup, do NOT delete files under base/<dbid>/ unless you know exactly what they hold, and do NOT run pg_upgrade on a cluster that is already reporting errors.

How PostgreSQL organizes data

PostgreSQL stores data under PGDATA: base/<dbid>/<oid> (one file per table or index, named by its relfilenode, which usually matches the OID until the relation is rewritten), pg_wal/ (Write-Ahead Log), global/ (cluster-wide system catalogs and pg_control), pg_xact/ (transaction commit status). Each table has a main fork plus an FSM (Free Space Map) and a VM (Visibility Map). Consistency is tracked through XIDs (transaction IDs) and LSNs (Log Sequence Numbers). Common failures: truncated WAL, corrupted pg_xact, damaged relation files and a mistakenly forced pg_resetwal.
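To make the layout concrete, here is a minimal Python sketch that splits a relation file path into its parts. It assumes paths relative to PGDATA and the default tablespace only; the names `RelFile` and `parse_rel_path` are illustrative, not part of any PostgreSQL tool. Files for temporary relations and anything outside base/ are deliberately left unmatched.

```python
import re
from typing import NamedTuple, Optional

# Matches e.g. "base/16384/16397", "base/16384/16397_fsm", "base/16384/16397.2".
# Fork suffixes: _fsm (Free Space Map), _vm (Visibility Map), _init (unlogged tables).
# A numeric ".N" suffix marks an extra segment of a relation larger than one segment file.
_REL_FILE = re.compile(
    r"^base/(?P<db>\d+)/(?P<node>\d+)(?:_(?P<fork>fsm|vm|init))?(?:\.(?P<seg>\d+))?$"
)


class RelFile(NamedTuple):
    db_oid: int        # directory name under base/
    relfilenode: int   # file name; may differ from the pg_class OID after a rewrite
    fork: str          # "main", "fsm", "vm" or "init"
    segment: int       # 0 for the first file, N for "<relfilenode>.N"


def parse_rel_path(path: str) -> Optional[RelFile]:
    """Return the parts of a relation file path, or None if it is not one."""
    m = _REL_FILE.match(path.replace("\\", "/"))
    if m is None:
        return None
    return RelFile(
        db_oid=int(m["db"]),
        relfilenode=int(m["node"]),
        fork=m["fork"] or "main",
        segment=int(m["seg"] or 0),
    )
```

For example, `parse_rel_path("base/16384/16397_vm")` yields a `RelFile` with `fork="vm"`, while `parse_rel_path("base/16384/PG_VERSION")` returns `None`.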

Common PostgreSQL symptoms

  • PostgreSQL won't start, "PANIC: could not locate a valid checkpoint record"
  • ERROR: invalid page in block X of relation Y
  • ERROR: could not read block X in file (corrupted cluster)
  • Database stuck in crash recovery or refusing connections after an OS crash
  • WAL files lost or pg_xlog/pg_wal empty
  • pg_resetwal was run erroneously
  • Replication standby out of sync
  • "missing chunk number 0 for toast value" error (TOAST corruption)
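The "WAL files lost" symptom can often be confirmed from a plain directory listing. The sketch below looks for holes in the sequence of WAL segment names. It assumes the default 16 MB segment size (configurable at initdb time since PostgreSQL 11), and `find_wal_gaps` is an illustrative name. It only reports gaps between segments that are present, and it says nothing about whether the contents of a present segment are valid, since recycled segments can carry names ahead of the data actually written.

```python
import re
from typing import Dict, Iterable, List, Set, Tuple

WAL_NAME = re.compile(r"^[0-9A-F]{24}$")

# Assumption: default 16 MB segments, i.e. 256 segments per 4 GB "log" unit.
SEGMENTS_PER_XLOGID = 0x100000000 // (16 * 1024 * 1024)


def parse_wal_name(name: str) -> Tuple[int, int]:
    """Return (timeline, segment_number) for a 24-hex-digit WAL file name."""
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return timeline, log * SEGMENTS_PER_XLOGID + seg


def find_wal_gaps(names: Iterable[str]) -> List[Tuple[int, int, int]]:
    """List (timeline, first_missing, last_missing) ranges between present segments."""
    by_timeline: Dict[int, Set[int]] = {}
    for name in names:
        if WAL_NAME.match(name):  # skips .history, .backup, archive_status, ...
            timeline, segno = parse_wal_name(name)
            by_timeline.setdefault(timeline, set()).add(segno)

    gaps = []
    for timeline in sorted(by_timeline):
        segs = sorted(by_timeline[timeline])
        for prev, cur in zip(segs, segs[1:]):
            if cur != prev + 1:
                gaps.append((timeline, prev + 1, cur - 1))
    return gaps
```

In practice the input would be something like `os.listdir(os.path.join(pgdata_copy, "pg_wal"))`, run against a copy rather than the live directory.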

Most frequent PostgreSQL causes

Cause | % | Recoverable?
Page corruption in relation file (OID) | 30% | ✅ Yes, pg parser + page repair
Lost or truncated WAL | 20% | ✅ Yes, base backup + partial WAL
Erroneously forced pg_resetwal | 15% | 🟡 Partial, loses XID history
Storage failure under PGDATA | 12% | ✅ Yes, storage recovery first
TOAST corruption | 10% | ✅ Yes, specific TOAST parser
Accidental DROP TABLE/DATABASE | 8% | ✅ Yes, OID file carving
Other (failed pg_upgrade, broken replication) | 5% | ✅ Yes

Source: HD Doctor internal stats on 180 PostgreSQL cases between 2022 and 2025.

What NOT to do with a failing PostgreSQL

  1. Do not run pg_resetwal on a production cluster. It resets the WAL and the XID counter, so transaction history is lost.
  2. Do not delete files in base/<dbid>/. Each file there is a table or an index, and deletion is destructive.
  3. Do not run VACUUM FULL on a cluster that is reporting errors. VACUUM FULL rewrites entire relations and can amplify corruption.
  4. Do not run pg_dump on a database you suspect is corrupted. pg_dump can crash partway through and leave locks pending.
  5. Do not run pg_upgrade on a cluster with known problems. An upgrade requires a consistent cluster; a failed one leaves a mix of versions.
  6. Do not force REINDEX on a corrupted index without extracting the data first. REINDEX assumes the underlying relation is intact.

How HD Doctor recovers PostgreSQL

We work only on copies of PGDATA. For page corruption we use a dedicated page parser; for lost WAL we reconstruct the last checkpoint.
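Working on copies only helps if the copy is byte-identical to the source. Here is a minimal Python sketch of a hash-verified copy, assuming the PostgreSQL server is stopped and the tree contains no symlinks (for example, no tablespaces outside PGDATA). The function names are illustrative and this is not a description of HD Doctor's internal tooling.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Dict


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large relation segments are not loaded whole."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def hash_tree(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root, '/'-separated) to its SHA-256."""
    return {
        p.relative_to(root).as_posix(): sha256_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def copy_and_verify(src: Path, dst: Path) -> Dict[str, str]:
    """Copy src to a new directory dst and fail loudly if any file differs."""
    before = hash_tree(src)
    shutil.copytree(src, dst)  # dst must not exist yet
    after = hash_tree(dst)
    if before != after:
        changed = sorted(
            k for k in before.keys() | after.keys() if before.get(k) != after.get(k)
        )
        raise RuntimeError(f"copy differs from source: {changed[:10]}")
    return after  # manifest that can be stored alongside the copy
```

The returned manifest can be saved and rechecked later to confirm that no recovery attempt has altered the working copy.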

  1. PGDATA intake
    You send the entire PGDATA/ directory or the drives from the PostgreSQL server.
  2. Diagnosis within 24h
    We analyze pg_control and identify the PostgreSQL version, the corruption type and the WAL status.
  3. Free written quote with scope
    You receive the technical analysis before approving anything, including a list of the relations that look recoverable.
  4. Native pg parser
    For page corruption, our proprietary parser extracts tuples from the intact pages of each relation.
  5. Control file reconstruction
    When pg_control is corrupted, we rebuild it from the WAL files and an analysis of relation LSNs.
  6. Recovery via WAL replay
    When partial WAL exists, we run a controlled replay up to the last consistent checkpoint.
  7. Specific TOAST extraction
    For TOAST corruption, a dedicated TOAST parser extracts chunks individually.
  8. Data validation
    We compare row counts, referential integrity and checksums on a test instance.
  9. Delivery + final report
    You receive a restored database, or a pg_dump in SQL format or CSV exports, together with a signed engineer report.
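The page-level extraction step depends on telling intact pages from damaged ones. The sketch below illustrates the general idea by sanity-checking the 24-byte page header of each block in a relation file. It is not HD Doctor's proprietary parser. It assumes the default 8 KB block size and a little-endian platform, and it does not verify data checksums or decode tuples.

```python
import struct
from typing import Iterator, NamedTuple, Tuple

PAGE_SIZE = 8192    # assumption: default BLCKSZ; custom builds may differ
HEADER_SIZE = 24    # size of the fixed page header
# pd_lsn (2 x uint32), pd_checksum, pd_flags, pd_lower, pd_upper, pd_special,
# pd_pagesize_version (all uint16), pd_prune_xid (uint32); little-endian assumed.
_HEADER = struct.Struct("<IIHHHHHHI")


class PageHeader(NamedTuple):
    lsn: int
    checksum: int
    flags: int
    lower: int
    upper: int
    special: int
    page_size: int
    layout_version: int
    prune_xid: int


def parse_page_header(page: bytes) -> PageHeader:
    hi, lo, checksum, flags, lower, upper, special, size_ver, prune = _HEADER.unpack_from(page)
    return PageHeader((hi << 32) | lo, checksum, flags, lower, upper, special,
                      size_ver & 0xFF00, size_ver & 0x00FF, prune)


def header_looks_sane(page: bytes) -> bool:
    """Cheap structural check; passing it does not prove the tuples are intact."""
    if len(page) != PAGE_SIZE:
        return False
    if page.count(0) == PAGE_SIZE:  # all-zero pages are valid, uninitialized pages
        return True
    h = parse_page_header(page)
    return (h.page_size == PAGE_SIZE
            and h.layout_version == 4  # page layout version since PostgreSQL 8.3
            and HEADER_SIZE <= h.lower <= h.upper <= h.special <= PAGE_SIZE)


def scan_relation(data: bytes) -> Iterator[Tuple[int, bool]]:
    """Yield (block_number, sane) for each full page in a relation segment."""
    for off in range(0, len(data) - PAGE_SIZE + 1, PAGE_SIZE):
        yield off // PAGE_SIZE, header_looks_sane(data[off:off + PAGE_SIZE])
```

Blocks that fail the check are the ones a tuple extractor would skip, matching the "invalid page in block X of relation Y" errors listed under symptoms.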

Turnaround and SLA

Scenario | Turnaround
Page corruption in 1 relation | 5–10 business days
Truncated WAL, checkpoint recovery | 7–14 business days
Forced pg_resetwal, history loss | 10–18 business days
Storage failure + cluster recovery | 12–22 business days
  • 24h emergency SLA available for production PostgreSQL.
  • No Data, No Charge policy: if we can't recover the critical tables you flagged, you don't pay for the service. Diagnosis is free in 92% of cases.

Versions and environments supported

We service PostgreSQL 9.6, 10, 11, 12, 13, 14, 15, 16 and 17. Distributions and extensions: EnterpriseDB (EDB), Citus, TimescaleDB, PostGIS. Configurations: standalone, streaming replication (primary-standby), logical replication, Patroni HA, BDR (Bi-Directional Replication). AWS RDS PostgreSQL and Aurora PostgreSQL are handled through snapshots.

Why HD Doctor for PostgreSQL

  • 🏛️ 24+ years focused exclusively on data recovery
  • 🔬 Class 100 cleanroom + in-house PostgreSQL infrastructure
  • 🧠 Native pg parser + control file reconstruction + WAL replay
  • ⚡ 24h emergency SLA for production PostgreSQL
  • 🤝 Only Western Digital Platinum Partner with a regional lab
  • ⚖️ Signed engineer report valid for forensics and insurance

PostgreSQL FAQ

PostgreSQL won't start: "PANIC: could not locate a valid checkpoint record". Recoverable?

Yes, in 90% of cases. The usual cause is truncated WAL. We rebuild pg_control and apply a controlled replay.
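Before any rebuild, it helps to record what the existing control file says. This sketch wraps the standard pg_controldata utility and parses its "label: value" output. It assumes a pg_controldata binary matching the cluster's major version is on PATH and that it is pointed at a copy of PGDATA. The helper names are illustrative.

```python
import os
import subprocess
from typing import Dict


def parse_controldata(text: str) -> Dict[str, str]:
    """Turn pg_controldata's 'label: value' lines into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def parse_lsn(text: str) -> int:
    """Convert an LSN printed as 'X/Y' (two hex numbers) into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def read_controldata(pgdata_copy: str) -> Dict[str, str]:
    """Run pg_controldata against a copy of PGDATA and parse its output."""
    result = subprocess.run(
        ["pg_controldata", "-D", pgdata_copy],
        check=True,
        capture_output=True,
        text=True,
        env={**os.environ, "LC_ALL": "C"},  # force untranslated labels
    )
    return parse_controldata(result.stdout)
```

The "Database cluster state" and "Latest checkpoint location" fields are usually the first things to note. If pg_controldata itself reports a CRC mismatch, the control file is damaged.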

Page corruption in critical table. Any chance?

Yes, in 88% of cases. Native pg parser extracts intact tuples from healthy pages, ignoring corrupted ones.

Ran pg_resetwal and lost everything. Can you recover?

We recover data in 70-80% of cases. pg_resetwal resets the WAL and the XID counter, but the relation files remain on disk. The history of in-flight transactions is lost, while committed data generally stays in place.

TOAST corruption in a large column. Can you recover it?

Yes. Specific TOAST parser extracts chunks individually, even when pg_toast_<oid> has corrupted pages.
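Chunk-by-chunk extraction works because a TOAST table stores each large value as rows of (chunk_id, chunk_seq, chunk_data). The sketch below reassembles values from chunks that were already pulled out of healthy pages and reports sequence numbers that are missing. The input tuples and the name `reassemble_toast` are illustrative assumptions. The sketch cannot detect lost trailing chunks, because the expected total size lives in the main table's TOAST pointer, and it does not decompress values that were stored compressed.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class ToastResult(NamedTuple):
    data: bytes         # surviving chunks concatenated in sequence order
    missing: List[int]  # chunk_seq values absent below the highest one seen


def reassemble_toast(
    chunks: Iterable[Tuple[int, int, bytes]]
) -> Dict[int, ToastResult]:
    """Group (chunk_id, chunk_seq, chunk_data) rows and rebuild each value."""
    grouped: Dict[int, Dict[int, bytes]] = defaultdict(dict)
    for chunk_id, chunk_seq, chunk_data in chunks:
        # Keep the first copy if the same chunk turns up twice (e.g. dead tuples).
        grouped[chunk_id].setdefault(chunk_seq, chunk_data)

    results = {}
    for chunk_id, by_seq in grouped.items():
        last = max(by_seq)
        missing = [seq for seq in range(last + 1) if seq not in by_seq]
        data = b"".join(by_seq[seq] for seq in sorted(by_seq))
        results[chunk_id] = ToastResult(data, missing)
    return results
```

A value whose `missing` list contains 0 corresponds to the "missing chunk number 0 for toast value" error that the server itself reports.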

How does the quote work?

Diagnosis is free. After technical assessment within 24h we send a detailed quote.

Do you serve AWS RDS PostgreSQL and Aurora?

For RDS/Aurora, we recover via available snapshots or pg_dump. Catastrophic failures need direct AWS support.

PostgreSQL critical issue? Talk now
