
PostgreSQL Data Recovery
Direct answer
PostgreSQL clusters typically fail because of a corrupted data directory (PGDATA), lost WAL (Write-Ahead Log) segments, damaged relation files under base/<dbid>/, a corrupted pg_wal directory (pg_xlog before PostgreSQL 10), failed major-version upgrades, or a mistakenly forced pg_resetwal. HD Doctor recovers 88% of PostgreSQL cases using a pg parser, control file reconstruction, and relation extraction, drawing on 24+ years of experience and 180+ PostgreSQL cases.
Critical: do NOT run pg_resetwal on a production cluster without a backup, do NOT delete files under base/<dbid>/ unless you know exactly what they are, and do NOT run pg_upgrade on a cluster that is already reporting errors.
How PostgreSQL organizes data
PostgreSQL stores its data under PGDATA: base/<dbid>/<oid> (one numbered file per table or index; the number is the relfilenode, which usually starts out equal to the relation's OID), pg_wal/ (Write-Ahead Log), global/ (cluster-wide catalogs and pg_control), and pg_xact/ (transaction commit status). Each table has a main fork plus an FSM (Free Space Map) and a VM (Visibility Map). Consistency is tracked through the XID (transaction ID) and the LSN (Log Sequence Number). Common failures: truncated WAL, corrupted pg_xact, damaged OID files, and a mistakenly forced pg_resetwal.
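To make the layout concrete, here is a minimal sketch that classifies a path inside PGDATA into database directory, file number, fork, and segment. The function and type names (`parse_relation_path`, `RelFile`) are hypothetical and exist only for this illustration; the suffixes `_fsm`, `_vm`, `_init` and the `.1`, `.2` segment extensions follow PostgreSQL's standard file naming. It is not HD Doctor's tooling.

```python
import re
from typing import NamedTuple, Optional

# Fork suffixes PostgreSQL appends to the numeric file name.
FORKS = {"": "main", "_fsm": "fsm", "_vm": "vm", "_init": "init"}

# <number>[_fork][.segment], for example 16385, 16385_fsm, 16385.1
_REL_FILE = re.compile(r"^(?P<node>\d+)(?P<fork>_fsm|_vm|_init)?(?:\.(?P<seg>\d+))?$")


class RelFile(NamedTuple):
    db_oid: int       # directory name under base/
    relfilenode: int  # numeric file name (often, but not always, the relation OID)
    fork: str         # main, fsm, vm or init
    segment: int      # 0 for the first segment file, 1 for ".1", and so on


def parse_relation_path(path: str) -> Optional[RelFile]:
    """Classify a path such as 'base/16384/16385_fsm'.

    Returns None when the path is not a relation file under base/.
    """
    parts = path.replace("\\", "/").strip("/").split("/")
    if len(parts) < 3 or parts[-3] != "base" or not parts[-2].isdigit():
        return None
    match = _REL_FILE.match(parts[-1])
    if match is None:
        return None
    return RelFile(
        db_oid=int(parts[-2]),
        relfilenode=int(match["node"]),
        fork=FORKS[match["fork"] or ""],
        segment=int(match["seg"] or 0),
    )
```

For example, `parse_relation_path("base/16384/16385_fsm")` identifies the Free Space Map of file 16385 in database directory 16384, while non-relation files such as `PG_VERSION` return `None`.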
Common PostgreSQL symptoms
- PostgreSQL won't start, "PANIC: could not locate a valid checkpoint record"
- ERROR: invalid page in block X of relation Y
- ERROR: could not read block X in file (corrupted cluster)
- Database in a suspect state after an OS crash (stuck in crash recovery or refusing connections)
- WAL files lost or pg_xlog/pg_wal empty
- pg_resetwal was run erroneously
- Replication standby out of sync
- "missing chunk number 0 for toast value" error (TOAST corruption)
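Errors such as "invalid page in block X" name a block number, not a byte position. As a rough sketch, the block can be mapped to a segment file and byte offset as below. The constants assume the default build settings (8 kB pages, 1 GB segment files); a cluster compiled with other values needs different numbers, and `locate_block` is a hypothetical helper name.

```python
from typing import Tuple

BLCKSZ = 8192         # assumed default page size in bytes
RELSEG_SIZE = 131072  # assumed default pages per segment file (1 GB / 8 kB)


def locate_block(relation_path: str, block: int) -> Tuple[str, int]:
    """Return (segment file path, byte offset) for a block number.

    relation_path is the first segment of the main fork,
    for example 'base/16384/16385'.
    """
    segment, block_in_segment = divmod(block, RELSEG_SIZE)
    # The first segment has no extension; later ones are .1, .2, ...
    filename = relation_path if segment == 0 else f"{relation_path}.{segment}"
    return filename, block_in_segment * BLCKSZ
```

Under these assumptions, block 5 of `base/16384/16385` starts at byte 40960 of the first segment file. This only locates the page; it should be read from a copy, never patched in place on the live cluster.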
Most frequent PostgreSQL causes
| Cause | Share of cases | Recoverable? |
|---|---|---|
| Page corruption in relation file (OID) | 30% | Yes, pg parser + page repair |
| Lost or truncated WAL | 20% | Yes, base backup + partial WAL |
| Erroneously forced pg_resetwal | 15% | Partial, loses XID history |
| Storage failure under PGDATA | 12% | Yes, storage recovery first |
| TOAST corruption | 10% | Yes, specific TOAST parser |
| Accidental DROP TABLE/DATABASE | 8% | Yes, OID file carving |
| Other (failed pg_upgrade, broken replication) | 5% | Yes |
Source: HD Doctor internal stats on 180 PostgreSQL cases between 2022 and 2025.
What NOT to do with a failing PostgreSQL
- 1. Do not run pg_resetwal on a production cluster. It resets the WAL and the XID counter, discarding transaction history.
- 2. Do not delete files in base/<dbid>/. Each OID file is a table or index, so deletion is destructive.
- 3. Do not run VACUUM FULL on a cluster that is reporting errors. VACUUM FULL rewrites entire relations and can amplify corruption.
- 4. Do not run pg_dump on a database with suspected corruption. pg_dump can crash midway and leave locks pending.
- 5. Do not run pg_upgrade on a damaged cluster. An upgrade requires a consistent cluster; a failed one leaves a mix of versions.
- 6. Do not force REINDEX on a corrupted index without extracting the data first. REINDEX assumes the underlying relation is intact.
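A safer first move than any of the actions above is usually to stop the server and take a verified copy of PGDATA, so that all later work happens on the copy. The sketch below shows one way to do that with the Python standard library. The helper names (`sha256_manifest`, `copy_and_verify`) are hypothetical, and it assumes PostgreSQL is stopped and that any tablespaces outside PGDATA are copied separately.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Dict


def sha256_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    manifest = {}
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest = hashlib.sha256()
        with path.open("rb") as handle:
            # Read in 1 MB chunks so large relation files are not loaded whole.
            for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                digest.update(chunk)
        manifest[path.relative_to(root).as_posix()] = digest.hexdigest()
    return manifest


def copy_and_verify(pgdata: Path, working_copy: Path) -> Dict[str, str]:
    """Copy PGDATA to a new directory and confirm the copy is byte-identical.

    Assumes the server is stopped; copying a running cluster this way
    can produce an inconsistent copy.
    """
    shutil.copytree(pgdata, working_copy, symlinks=True)
    if sha256_manifest(pgdata) != sha256_manifest(working_copy):
        raise RuntimeError("working copy differs from source; do not proceed")
    return sha256_manifest(working_copy)
```

The returned manifest can be stored next to the copy as a record of what the data directory looked like before any recovery attempt.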
How HD Doctor recovers PostgreSQL
We work on copies of PGDATA. For page corruption we use a page-level parser; for lost WAL we reconstruct the checkpoint.
- 1. PGDATA intake. You send the entire PGDATA/ directory or the drives from the PostgreSQL server.
- 2. Diagnosis within 24h. pg_control analysis, PG version identification, corruption type, and WAL status.
- 3. Free written quote with scope. Technical analysis before approval, listing the relations that are viable for recovery.
- 4. Native pg parser. For page corruption, a proprietary parser extracts tuples from the intact pages of each relation.
- 5. Control file reconstruction. When pg_control is corrupted, we rebuild it from the WAL files and an analysis of relation LSNs.
- 6. Recovery via WAL replay. When partial WAL exists, we apply a controlled replay up to the last consistent checkpoint.
- 7. Specific TOAST extraction. For TOAST corruption, a TOAST parser extracts chunks individually.
- 8. Data validation. We compare row counts, referential integrity, and checksums in a test instance.
- 9. Delivery + final report. A restored database delivered as a pg_dump in SQL or as a CSV export, plus a signed engineer report.
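The count comparison in the validation step can be pictured with a small sketch like the one below. It is an illustration only, not HD Doctor's validation suite: `compare_row_counts` is a hypothetical name, and the expected counts are assumed to come from some reference the customer trusts, such as an older backup or an application report.

```python
from typing import Dict, List, Optional, Tuple

Mismatch = Tuple[str, Optional[int], Optional[int]]


def compare_row_counts(expected: Dict[str, int], restored: Dict[str, int]) -> List[Mismatch]:
    """Return (table, expected, restored) for every table that differs.

    A table present on only one side is reported with None on the other.
    """
    problems = []
    for table in sorted(set(expected) | set(restored)):
        want, got = expected.get(table), restored.get(table)
        if want != got:
            problems.append((table, want, got))
    return problems
```

An empty result means every table matched its reference count. Matching counts alone do not prove the rows are correct, which is why the step also covers referential integrity and checksums.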
Turnaround and SLA
| Scenario | Turnaround |
|---|---|
| Page corruption in 1 relation | 5–10 business days |
| Truncated WAL, checkpoint recovery | 7–14 business days |
| Forced pg_resetwal, history loss | 10–18 business days |
| Storage failure + cluster recovery | 12–22 business days |
- 24h emergency SLA available for production PostgreSQL.
- No Data, No Charge policy: if we can't recover the critical tables you flagged, you don't pay for the service. Diagnosis is free in 92% of cases.
Versions and environments supported
We service PostgreSQL 9.6, 10, 11, 12, 13, 14, 15, 16, and 17. Forks and extensions: EnterpriseDB (EDB), Citus, TimescaleDB, PostGIS. Configurations: standalone, streaming replication (primary-standby), logical replication, Patroni HA, BDR (Bi-Directional Replication). AWS RDS for PostgreSQL and Aurora PostgreSQL are handled via snapshots.
Why HD Doctor for PostgreSQL
- 24+ years focused exclusively on data recovery
- Class 100 cleanroom + in-house PostgreSQL infrastructure
- Native pg parser + control file reconstruction + WAL replay
- 24h emergency SLA for production PostgreSQL
- Only Western Digital Platinum Partner with a regional lab
- Signed engineer report valid for forensics and insurance
PostgreSQL FAQ
PostgreSQL won't start: "PANIC: could not locate a valid checkpoint record". Recoverable?
Yes, in 90% of cases. The usual cause is truncated WAL. We rebuild pg_control and apply a controlled replay.
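This error generally means the checkpoint location recorded in pg_control points into a WAL segment that is missing or unreadable. As a hedged illustration, the sketch below works out which segment file should contain a given LSN, so you can check whether that file still exists in pg_wal. It assumes the default 16 MB segment size and timeline 1 unless told otherwise; `parse_lsn` and `wal_file_name` are hypothetical helper names.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default; can be changed at initdb time


def parse_lsn(lsn: str) -> int:
    """Convert an LSN in 'X/Y' hexadecimal form to a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_file_name(lsn: str, timeline: int = 1,
                  segment_size: int = WAL_SEGMENT_SIZE) -> str:
    """Name of the WAL segment file that contains the byte at this LSN.

    File names are three 8-digit hex fields: timeline, then the segment
    number split into a high and a low part.
    """
    segments_per_id = 0x100000000 // segment_size
    segment_number = parse_lsn(lsn) // segment_size
    return (f"{timeline:08X}"
            f"{segment_number // segments_per_id:08X}"
            f"{segment_number % segments_per_id:08X}")
```

For instance, under these assumptions `wal_file_name("0/3000028")` gives `000000010000000000000003`. On a readable control file, the pg_controldata utility prints the checkpoint and REDO locations that this calculation starts from.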
Page corruption in a critical table. Any chance?
Yes, in 88% of cases. Our native pg parser extracts intact tuples from healthy pages and skips the corrupted ones.
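To give an idea of what separating healthy pages from corrupted ones involves, here is a simplified sketch that sanity-checks the 24-byte header of each page. It is not the proprietary parser mentioned above. It assumes 8 kB pages written on a little-endian machine and checks only the page-size field and the ordering of the header pointers; the names `page_problem` and `scan_segment` are hypothetical.

```python
import struct
from typing import Iterator, Optional, Tuple

BLCKSZ = 8192  # assumed default page size
# pd_lsn (2 x uint32), pd_checksum, pd_flags, pd_lower, pd_upper,
# pd_special, pd_pagesize_version (uint16 each), pd_prune_xid (uint32)
PAGE_HEADER = struct.Struct("<IIHHHHHHI")  # 24 bytes, little-endian assumed


def page_problem(page: bytes) -> Optional[str]:
    """Return a short description of what looks wrong, or None if plausible."""
    if len(page) != BLCKSZ:
        return "short read"
    if page == bytes(BLCKSZ):
        return None  # an all-zero page is a valid, never-initialised page
    (_lsn_hi, _lsn_lo, _checksum, _flags, lower, upper, special,
     size_version, _prune_xid) = PAGE_HEADER.unpack_from(page)
    # The high byte of pd_pagesize_version carries the page size.
    if (size_version & 0xFF00) != BLCKSZ:
        return "page size field mismatch"
    # Line pointers end at pd_lower, tuples start at pd_upper.
    if not (PAGE_HEADER.size <= lower <= upper <= special <= BLCKSZ):
        return "pointers out of order"
    return None


def scan_segment(data: bytes) -> Iterator[Tuple[int, Optional[str]]]:
    """Yield (block number within the segment, problem) for each page."""
    for offset in range(0, len(data), BLCKSZ):
        yield offset // BLCKSZ, page_problem(data[offset:offset + BLCKSZ])
```

A page that passes these checks can still hold damaged tuples, so a real extraction also has to validate line pointers, tuple headers, and (when enabled) page checksums.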
Ran pg_resetwal and lost everything. Can you recover?
We recover in 70-80% of cases. pg_resetwal resets the WAL and the XID counter, but the relation files remain. The history of pending transactions is lost, while committed data that had already reached the relation files stays.
TOAST corruption in a large column. Can you recover it?
Yes. A dedicated TOAST parser extracts chunks individually, even when the pg_toast_<oid> relation has corrupted pages.
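A TOAST table stores each large value as rows of (chunk_id, chunk_seq, chunk_data), and the "missing chunk number" error appears when one of those rows cannot be found. The sketch below shows the general idea of regrouping extracted chunks and reporting gaps. It assumes the chunk rows have already been pulled out of the pages, and `reassemble` is a hypothetical name rather than part of any PostgreSQL or HD Doctor tool.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Chunk = Tuple[int, int, bytes]  # (chunk_id, chunk_seq, chunk_data)


def reassemble(chunks: Iterable[Chunk]) -> Tuple[Dict[int, bytes], Dict[int, List[int]]]:
    """Group chunks by value and split them into complete and damaged sets.

    Returns (complete, missing): complete maps chunk_id to the joined bytes,
    missing maps chunk_id to the sequence numbers that were not found.
    """
    by_value: Dict[int, Dict[int, bytes]] = defaultdict(dict)
    for chunk_id, chunk_seq, data in chunks:
        by_value[chunk_id][chunk_seq] = data

    complete: Dict[int, bytes] = {}
    missing: Dict[int, List[int]] = {}
    for chunk_id, parts in by_value.items():
        # Sequence numbers start at 0; any hole below the highest one is a gap.
        gaps = [seq for seq in range(max(parts) + 1) if seq not in parts]
        if gaps:
            missing[chunk_id] = gaps
        else:
            complete[chunk_id] = b"".join(parts[seq] for seq in range(len(parts)))
    return complete, missing
```

This cannot detect a lost final chunk on its own, because the expected total size lives in the TOAST pointer of the main table row, and the joined bytes may still be compressed.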
How does the quote work?
Diagnosis is free. After technical assessment within 24h we send a detailed quote.
Do you serve AWS RDS PostgreSQL and Aurora?
For RDS/Aurora, we recover via available snapshots or pg_dump. Catastrophic failures need direct AWS support.