# Fast Revision Purge

Fast, safe tools to **plan** and **purge** old Drupal revisions at scale — with clear estimates, totals, and an admin UI.

> Primary goal: reclaim database space by deleting unneeded **node**, **paragraph**, and **Layout Builder** revision rows without risking live content.

---

## TL;DR

- **Plan (Dry run)** computes **KEEP/DELETE** sets and estimates **Potential space that can be cleaned up**.
- **Run purge** deletes in **chunks** (paragraphs first, then nodes) and records **bytes freed** and revision totals.
- A small **stats** table keeps “**Last Dry Run** / **Last Purged**” timestamps (shown as “_…ago_”) and cumulative totals.
- Clean, single-click **admin UI** + **Drush** commands.

---

## Key Features

- **Dry run planner**
  - Populates working tables:
    - `fastrev_node_keep(vid)`, `fastrev_node_delete(vid)`
    - `fastrev_par_in_use(rid)`, `fastrev_par_delete(rid)`
  - Supports Drupal 10/11 paragraph schema differences by introspection.
  - Calculates a **byte estimate** using `information_schema` average row sizes.
  - Writes `potential_claimable_space` and `last_dryrun_timestamp` into the stats table.

- **Chunked purger**
  - Deletes **paragraph** revision rows first, then **node** revisions (safer for relationships).
  - Deletes from **field revision tables** before **core revision tables**.
  - Estimates **bytes freed** this run and tracks **Layout Builder** revision-field rows deleted.
  - Writes:
    - `total_node_revisions_deleted`
    - `total_para_revisions_deleted`
    - `total_lb_revisions_deleted`
    - `space_freed_last_run`, updates `space_freed_total`
    - `last_purge_timestamp`

- **Admin UI (Settings form)**
  - **Database overview**: current DB size, **Top 3 biggest tables**.
  - **Potential space that can be cleaned up** (from last dry run).
  - **Last Run**: shows **Last Dry Run** and **Last Purged** as “_…ago_”.
  - **Dry run report**: prints counts for **Node KEEP/DELETE**, **Paragraph KEEP/DELETE**, **Layout Builder KEEP/DELETE**.
  - **Retention policy**: keep-latest (per node/per language), since-date, protect latest published, paragraph keep-last.
  - **Execution settings**: chunk size, sleep (ms), optional “ensure indexes”.
  - **Sanity checks**: one-click toggle that displays copy/paste Drush and SQL sanity queries.
  - **Danger zone**: confirmation checkbox, optional extra purges for Paragraphs / Layout Builder via dedicated truncators.
  - **Actions**: Save, Plan (Dry run), Run purge now.

- **Drush**
  - `fastrev:report` (aka `fr:report`) — runs planner, prints KEEP/DELETE + samples, potential space, last-run times.
  - `fastrev:purge` (aka `fr:purge`) — executes purge; prints totals, **LB rows deleted**, and **estimated bytes freed**.
  - `fastrev:reindex` (aka `fr:reindex`) — ensures helpful DB indexes.

---

## Installation

1. Place the module in your codebase and enable it.
2. On install, the module creates a singleton **stats** table: `fastrev_stats` (see schema below).
3. Navigate to **Configuration → Fast Revision Purge** to use the UI _or_ use Drush (see below).

> **Uninstall**: drops `fastrev_stats`. The planner's working tables may exist only transiently, and the plan can be regenerated by running a dry run again after reinstalling.

---

## Services Overview

The module exposes the following key services (excerpt from `.services.yml`):

- `fast_revision_purge.planner` — computes KEEP/DELETE sets; updates **stats** after dry run.
- `fast_revision_purge.purger` — performs chunked deletions; updates **stats** after purge.
- `fast_revision_purge.table_map` — discovers revision and `entity_reference_revisions` field tables.
- `fast_revision_purge.index_manager` — ensures performance indexes.
- `fast_revision_purge.table_stats` — database size helpers (DB size, top tables, avg row size).
- `fast_revision_purge.stats` — thin storage wrapper for `fastrev_stats`.

The Drush command service is declared in `drush.services.yml`:

```yaml
services:
  fast_revision_purge.commands:
    class: Drupal\fast_revision_purge\Commands\FastRevCommands
    arguments:
      - '@fast_revision_purge.purger'
      - '@fast_revision_purge.planner'
      - '@fast_revision_purge.index_manager'
      - '@fast_revision_purge.table_map'
      - '@fast_revision_purge.stats'
    tags:
      - { name: drush.command }
```

---

## Stats Table

A single-row table used for dashboard/Drush summaries:

| Column | Meaning |
|---|---|
| `total_node_revisions_deleted` | Cumulative node revision rows deleted. |
| `total_para_revisions_deleted` | Cumulative paragraph revision rows deleted. |
| `total_lb_revisions_deleted` | Cumulative **Layout Builder** field revision rows deleted. |
| `space_freed_last_run` | Estimated bytes freed by the most recent purge run. |
| `space_freed_total` | Cumulative estimated bytes freed across purges. |
| `potential_claimable_space` | Estimated bytes reclaimable based on the latest **dry run plan**. |
| `last_dryrun_timestamp` | UNIX timestamp of last dry run completion. |
| `last_purge_timestamp` | UNIX timestamp of last purge completion. |

> Table name: `fastrev_stats`. The installer inserts the single row (`id=1`).

---

## Working Tables (Planner)

- `fastrev_node_keep(vid)` — revisions to **keep**.
- `fastrev_node_delete(vid)` — revisions to **delete** (non-default, not in keep).
- `fastrev_par_in_use(rid)` — paragraph revisions currently **in use** or to keep (current pointers, keep-last M, reachable via ERR graph).
- `fastrev_par_delete(rid)` — paragraph revisions to **delete** (not in use / not kept).

> Depending on your site’s build, you may also see `fastrev_lb_keep` / `fastrev_lb_delete`, which are used by Layout Builder (LB) specific flows. The main purge also counts rows deleted from `node_revision__layout_builder__layout` when that table exists.
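
To make the KEEP/DELETE split concrete, here is a minimal Python sketch of how a node plan could be derived from a retention policy. It is an illustration under stated assumptions, not the module's planner: the `Revision` record, the `plan_node_sets` function, and its parameters are hypothetical names, and per-language handling is left out for brevity.

```python
from datetime import date
from typing import NamedTuple


class Revision(NamedTuple):
    """Hypothetical in-memory stand-in for one node revision row."""
    vid: int          # revision ID
    nid: int          # owning node ID
    created: date     # revision creation date
    published: bool   # whether this revision was published
    is_default: bool  # whether this is the node's current (default) revision


def plan_node_sets(revisions, keep_last, since=None, protect_published=True):
    """Return (keep, delete) sets of vids under a simplified retention policy."""
    keep = set()
    by_node = {}
    for rev in revisions:
        by_node.setdefault(rev.nid, []).append(rev)

    for revs in by_node.values():
        revs.sort(key=lambda r: r.vid, reverse=True)  # newest first
        # Keep the newest `keep_last` revisions of each node.
        keep.update(r.vid for r in revs[:keep_last])
        # Never delete the default revision.
        keep.update(r.vid for r in revs if r.is_default)
        # Keep everything created on or after the cutoff date.
        if since is not None:
            keep.update(r.vid for r in revs if r.created >= since)
        # Keep the newest published revision of the node.
        if protect_published:
            latest_pub = next((r for r in revs if r.published), None)
            if latest_pub is not None:
                keep.add(latest_pub.vid)

    # Everything that is not explicitly kept is a delete candidate.
    delete = {r.vid for r in revisions} - keep
    return keep, delete
```

With four revisions of one node where revision 1 is the only published one and revision 4 is the default, `plan_node_sets(revs, keep_last=1)` keeps revisions 1 and 4 and marks 2 and 3 for deletion.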

---

## How Estimates Are Computed

### Potential claimable space (after **dry run**)
1. Count rows that **would** be deleted for each impacted table by joining working sets (`fastrev_node_delete`, `fastrev_par_delete`) to revision tables:
   - Node: `node_revision` and every `node_revision__%` field table.
   - Paragraph: `paragraph_revision` and every `paragraph_revision__%` field table.
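2. For each table, compute `avg_row_size = (DATA_LENGTH + INDEX_LENGTH) / TABLE_ROWS` from `information_schema.TABLES`. On InnoDB, `TABLE_ROWS` is itself an approximation, which is one reason the result is only an estimate.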
3. Estimate per-table bytes: `rows_to_delete * avg_row_size`. Sum across tables.
4. Persist to `fastrev_stats.potential_claimable_space` + `last_dryrun_timestamp`.
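
The arithmetic in the steps above can be sketched in a few lines of Python. This is a hedged illustration, not the module's code: the function names and the input shapes are assumptions, and the zero-row guard is an addition for tables that report `TABLE_ROWS = 0`.

```python
def avg_row_size(data_length, index_length, table_rows):
    """Average bytes per row, following the formula in step 2."""
    if table_rows <= 0:
        return 0.0  # assumed guard: avoid dividing by zero for empty tables
    return (data_length + index_length) / table_rows


def estimate_claimable_bytes(table_info, rows_to_delete):
    """Sum rows_to_delete * avg_row_size over every impacted table.

    table_info maps table name -> (DATA_LENGTH, INDEX_LENGTH, TABLE_ROWS).
    rows_to_delete maps table name -> number of rows the plan would delete.
    """
    total = 0.0
    for table, count in rows_to_delete.items():
        data_length, index_length, table_rows = table_info[table]
        total += count * avg_row_size(data_length, index_length, table_rows)
    return int(total)


# Made-up figures for two tables, purely for illustration.
table_info = {
    "node_revision": (8_000_000, 2_000_000, 100_000),         # 100 bytes/row
    "node_revision__body": (50_000_000, 10_000_000, 120_000),  # 500 bytes/row
}
rows_to_delete = {"node_revision": 40_000, "node_revision__body": 48_000}

print(estimate_claimable_bytes(table_info, rows_to_delete))  # prints 28000000
```

Here 40,000 rows at 100 bytes plus 48,000 rows at 500 bytes gives 28,000,000 bytes of potential space.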

### Bytes freed (after **purge**)
- During deletion, we track **rows deleted per table** and sum `rows_deleted * avg_row_size(table)` for a best-effort **`space_freed_last_run`** estimate, then increment `space_freed_total`.

> ⚠️ These are estimates. Deleting rows does not by itself shrink files on disk; actually reclaiming the space may require engine-specific maintenance (e.g., `OPTIMIZE TABLE` or vacuum-like operations).
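
As a rough sketch of the bookkeeping described above, the following Python function folds one run's per-table delete counts into a stats record. The column names come from the stats table; the function name `record_purge` and the dictionary-based storage are assumptions made for illustration.

```python
def record_purge(stats, rows_deleted, avg_sizes, now):
    """Return an updated copy of the stats row after one purge run.

    stats        -- dict holding the current fastrev_stats values
    rows_deleted -- table name -> rows deleted during this run
    avg_sizes    -- table name -> average row size in bytes
    now          -- UNIX timestamp of purge completion
    """
    freed = int(sum(
        count * avg_sizes.get(table, 0.0)  # unknown tables contribute 0 bytes
        for table, count in rows_deleted.items()
    ))
    updated = dict(stats)
    updated["space_freed_last_run"] = freed
    updated["space_freed_total"] = stats.get("space_freed_total", 0) + freed
    updated["last_purge_timestamp"] = now
    return updated
```

For example, deleting 10 rows averaging 200 bytes and 5 rows averaging 100 bytes records 2,500 bytes for the run and adds the same amount to the cumulative total.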

---

## Admin UI Walkthrough

1. **Database overview**
   - **Current DB Size**
   - **Top 3 biggest tables**
   - **Last Run**: “Last Dry Run” and “Last Purged” shown as “_…ago_”
   - **Potential space that can be cleaned up** (from last dry run)

2. **Retention policy**
   - Keep latest **N** revisions (per node or **per language**)
   - Keep revisions **since date**
   - **Protect latest published** revision per node
   - Keep last **M** paragraph revisions

3. **Execution settings**
   - Chunk size, Sleep ms
   - Optional **Ensure helpful DB indexes**

4. **Sanity checks**
   - Toggle to display **copy/paste** Drush + SQL sanity queries

5. **Danger zone**
   - Confirmation checkbox
   - Optional **Paragraph** / **Layout Builder** purges via specialized truncators

6. **Actions**
   - **Save configuration**
   - **Plan (Dry run)** (batch)
   - **Run purge now** (batch)

7. **Dry run report**
   - **Node**: KEEP / DELETE
   - **Paragraph**: KEEP / DELETE
   - **Layout Builder**: KEEP / DELETE
   - Sample IDs to delete (first 10 of each, if available)

> Note: The older “Extra purge estimates” UI has been removed. It is replaced by the unified **Potential space** figure and the concise **Dry run report**.

---

## Drush

### Plan / report

```bash
drush fastrev:report \
  --keep-last=5 \
  --since=2024-01-01 \
  --protect-published \
  --per-language \
  --keep-paragraph-last=1
# Aliases: drush fr:report
```

Prints KEEP/DELETE counts, sample IDs, **Potential space**, and “_Last Dry Run_ / _Last Purged_ (…ago)”.

### Purge

```bash
drush fastrev:purge --chunk=5000 --sleep-ms=50
# Aliases: drush fr:purge
```

Performs chunked deletion, then prints **Node/Paragraph totals**, **Layout Builder rows deleted**, and **Estimated bytes freed** (from last run).
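
For readers who want to see the chunk/sleep pattern in isolation, here is a self-contained Python sketch that runs against an in-memory SQLite database. It is not the module's purger. The function name and signature are hypothetical, draining the working table as chunks complete is an assumed way of tracking progress, and the default chunk is kept small because older SQLite builds cap the number of bound parameters per statement.

```python
import sqlite3
import time


def purge_in_chunks(conn, work_table, work_column, targets, chunk=500, sleep_ms=0):
    """Delete the IDs listed in work_table from each target, one chunk at a time.

    targets is an ordered list of (table, id_column) pairs: field revision
    tables first, the core revision table last.
    """
    deleted = {table: 0 for table, _ in targets}
    while True:
        rows = conn.execute(
            f"SELECT {work_column} FROM {work_table} ORDER BY {work_column} LIMIT ?",
            (chunk,),
        ).fetchall()
        if not rows:
            break
        ids = [row[0] for row in rows]
        marks = ",".join("?" * len(ids))
        for table, column in targets:
            cur = conn.execute(f"DELETE FROM {table} WHERE {column} IN ({marks})", ids)
            deleted[table] += cur.rowcount
        # Assumption: remove processed IDs so the next loop sees the next chunk.
        conn.execute(f"DELETE FROM {work_table} WHERE {work_column} IN ({marks})", ids)
        conn.commit()
        if sleep_ms:
            time.sleep(sleep_ms / 1000)  # give other queries room on busy sites
    return deleted


def demo():
    """Build a tiny schema, purge 7 of 10 revisions, and report the result."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE node_revision (vid INTEGER PRIMARY KEY);
        CREATE TABLE node_revision__body (revision_id INTEGER);
        CREATE TABLE fastrev_node_delete (vid INTEGER PRIMARY KEY);
    """)
    all_ids = [(i,) for i in range(1, 11)]
    conn.executemany("INSERT INTO node_revision VALUES (?)", all_ids)
    conn.executemany("INSERT INTO node_revision__body VALUES (?)", all_ids)
    conn.executemany("INSERT INTO fastrev_node_delete VALUES (?)", all_ids[:7])
    conn.commit()
    deleted = purge_in_chunks(
        conn, "fastrev_node_delete", "vid",
        [("node_revision__body", "revision_id"), ("node_revision", "vid")],
        chunk=3,
    )
    remaining = conn.execute("SELECT COUNT(*) FROM node_revision").fetchone()[0]
    return deleted, remaining
```

Calling `demo()` deletes seven rows from each table in three chunks and leaves three node revisions in place. Table and column names are interpolated directly into the SQL, so a real implementation should only accept them from a trusted, discovered table map.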

### Indexing

```bash
drush fastrev:reindex
# Aliases: drush fr:reindex
```

Ensures a baseline set of helpful indexes for planning/purging workloads.

---

## Safety & Ordering

- **Paragraphs first**, then **nodes** — prevents dangling references.
- Always remove from **field revision tables first**, then the **core** revision table.
- “Protect latest published” option keeps the latest published node revision per node.
- Paragraphs: keeps current/default pointers and the last **M** revisions per paragraph entity.
- Paragraph nesting through `entity_reference_revisions` (ERR) fields is traversed with a bounded **BFS** (breadth-first search, up to 50 iterations).
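
The bounded traversal mentioned in the last bullet can be illustrated with a short Python sketch. It assumes the nesting has already been loaded into a parent-to-children mapping. The function name and the in-memory representation are hypothetical, and only the cap of 50 levels comes from the text above.

```python
def reachable_paragraph_revisions(roots, children, max_iterations=50):
    """Collect paragraph revision IDs reachable from the root revisions.

    roots    -- revision IDs referenced directly by kept host revisions
    children -- parent revision ID -> iterable of nested revision IDs
    Each loop iteration expands one nesting level (breadth-first).
    """
    in_use = set(roots)
    frontier = set(roots)
    for _ in range(max_iterations):
        next_frontier = set()
        for rid in frontier:
            for child in children.get(rid, ()):
                if child not in in_use:
                    in_use.add(child)
                    next_frontier.add(child)
        if not next_frontier:
            break  # nothing new found, so the traversal is complete
        frontier = next_frontier
    return in_use


# Revision 1 nests 2 and 3, 2 nests 4, and 4 nests 5; 9 is referenced by nothing.
children = {1: [2, 3], 2: [4], 4: [5]}
print(sorted(reachable_paragraph_revisions([1], children)))  # prints [1, 2, 3, 4, 5]
```

Revision 9 never enters the in-use set, so it would remain a delete candidate. The iteration cap trades completeness for a guaranteed stopping point: nesting deeper than the cap would not be followed.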

---

## Performance Tips

- Use a **moderate chunk size** (e.g., 5k–20k) and **sleep** between chunks on busy sites.
- Run **Plan (Dry run)** shortly before purging so the **Potential space** estimate stays fresh.
- Consider off-hours for large purges.
- MySQL/MariaDB: reclaim table space with `OPTIMIZE TABLE` if needed after large deletes.

---

## Troubleshooting

- **No Potential space shown** — run **Plan (Dry run)**; this writes `potential_claimable_space`.
- **Empty “Last Run”** — you haven’t run a dry run or purge on this install yet.
- **Estimates look off** — averages come from `information_schema`. Huge skewed rows or sparse tables can make the estimate conservative/optimistic.
- **Permission errors on `information_schema`** — ensure the DB user can read `information_schema.TABLES`.
- **Paragraphs/LB options disabled** — enable the respective modules.

---

## Extensibility Notes

- `RevisionTableMap` discovers revision (REV) and `entity_reference_revisions` (ERR) tables at runtime; add your field types as needed.
- `Planner`/`Purger` are small, testable classes. You can extend the estimator set or plug in additional entity types.
- `StatsStorage` is the single place to read/write the `fastrev_stats` row.

---

## File Map (high level)

- `fast_revision_purge.install` — creates `fastrev_stats` (singleton row) on install.
- `src/Service/StatsStorage.php` — read/write stats.
- `src/Service/Planner.php` — working-set planner + potential-space estimator.
- `src/Service/Purger.php` — chunked deletion + freed-bytes estimator.
- `src/Service/RevisionTableMap.php` — revision table discovery.
- `src/Service/TableStats.php` — DB size helpers.
- `src/Form/SettingsForm.php` — admin UI.
- `src/Batch/PlanBatch.php` — batch wrapper for planner.
- `src/Batch/PurgeBatch.php` — batch wrapper for purger.
- `src/Commands/FastRevCommands.php` — Drush commands.
- `drush.services.yml` / `.services.yml` — service wiring.

---

## Disclaimer

This module deletes data. Test your policy and chunk sizes in a **non-production** environment and take backups before large purges.

## License

GPL v2 or later

## 🙏 Author

**Sohaib Mahtab** 
- [Drupal.org Profile](https://www.drupal.org/u/smahtab) 
- [LinkedIn](https://www.linkedin.com/in/smahtab/)