# Sanitize Placeholder

Enforces short or sanitized usernames and generates thematic placeholders
after `drush sql:sanitize` has finished. Provides a Drush command to apply
per-field replacement strategies across entities.

- **Config page:** Administration → Configuration → Development →
  **Sanitize Placeholder**
  `/admin/config/development/sanitize-placeholder`
- **Post-sanitize hook:** runs `sp:fake` automatically after
  `sql:sanitize` (scope fixed to **`sanitized`**). It only targets fields
  that Drush actually sanitized and that also match your rules.

## Compatibility

- **Drupal 10.2+** and **Drupal 11**.

## Installation

Install as you would normally install a contributed Drupal module.

Enable the main module:

- `sanitize_placeholder`

Optionally enable extra, country-specific strategies:

- `sanitize_placeholder_extra`

> Optional dependency (for richer, locale-aware data): `fakerphp/faker`.
> Install it on the site if you want localized names or companies:
>
> ```bash
> composer require fakerphp/faker:^1.23
> ```
>
> If Faker is not installed, the module falls back to an internal
> algorithm that generates pronounceable fake names and companies
> (deterministically if seeded).

## Configuration

Open the settings form and define rules in **Per-field replacement rules**.

**How rules are applied**

- During `drush sql:sanitize`, the module’s post-hook runs with scope
  **`sanitized`**. It considers only the fields that Drush sanitized and
  that also match your configured rules.
- When you run `drush sp:fake` yourself, you can choose the scope:
  `sanitized`, `empty`, or `all`. Using `--scope=all` ignores the
  “sanitized” restriction and rewrites every matching field according to
  your rules.
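
The three scopes can be read as a simple predicate over each candidate field. The sketch below is illustrative only: `sp_scope_allows()` and its two boolean inputs are hypothetical names, and the reading of `empty` as "the field currently holds no value" is an assumption based on the CLI examples in this README, not the module's actual code.

```php
<?php

declare(strict_types=1);

/**
 * Decides whether a field value should be replaced under a given scope.
 *
 * Hypothetical helper for illustration; not part of the module's API.
 */
function sp_scope_allows(string $scope, bool $was_sanitized, bool $is_empty): bool {
  return match ($scope) {
    // Only fields that Drush itself sanitized.
    'sanitized' => $was_sanitized,
    // Assumption: only fields that currently hold no value.
    'empty' => $is_empty,
    // Every field that matches a rule.
    'all' => TRUE,
    default => throw new InvalidArgumentException("Unknown scope: $scope"),
  };
}

var_dump(sp_scope_allows('sanitized', FALSE, TRUE));
var_dump(sp_scope_allows('all', FALSE, FALSE));
```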

### Rule format

One rule per line:

```text
entity.bundle.field=strategy
```

**Examples**

```text
user.user.field_first_name=first_name
user.user.field_last_name=last_name
user.user.field_institution=institution
user.user.name=username
node.article.field_company=institution
taxonomy_term.tags.name=pattern
```
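
To make the rule syntax concrete, here is a minimal sketch of how such lines could be split into their four parts. `sp_parse_rules()` is a hypothetical helper written for this README, and the choice to silently skip blank or malformed lines is an assumption; the module's own parser may validate differently.

```php
<?php

declare(strict_types=1);

/**
 * Parses rule lines of the form "entity.bundle.field=strategy".
 *
 * Hypothetical helper; blank and malformed lines are skipped.
 *
 * @return array<int, array{entity: string, bundle: string, field: string, strategy: string}>
 */
function sp_parse_rules(string $text): array {
  $rules = [];
  // Assumption: machine names use lowercase letters, digits and underscores.
  $pattern = '/^([a-z0-9_]+)\.([a-z0-9_]+)\.([a-z0-9_]+)\s*=\s*([a-z0-9_]+)$/';
  foreach (preg_split('/\R/', $text) ?: [] as $line) {
    if (!preg_match($pattern, trim($line), $m)) {
      continue;
    }
    $rules[] = [
      'entity' => $m[1],
      'bundle' => $m[2],
      'field' => $m[3],
      'strategy' => $m[4],
    ];
  }
  return $rules;
}

$rules = sp_parse_rules("user.user.name=username\n\nnot a rule\nnode.article.field_company=institution");
// Two of the four lines are well-formed rules.
print count($rules) . "\n";
```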

### Available strategies (label – machine name)

- First name – `first_name`
- Last name – `last_name`
- Username – `username`
- Institution / Company – `institution`
- Pattern – `pattern`
- Domain – `domain`
- Coordinates (lat,long) – `coords`
- French SIREN (Luhn) – `siren_fr`
- Portuguese license plate (AA-00-AA) – `license_plate_pt`
- Portuguese NIF – `nif`
- Spanish NIF / NIE – `nif_es`
- Portuguese IBAN (PT) – `iban_pt`
- German IBAN (DE) – `iban_de`

> Use the machine name on the right side of each rule.
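
Several of these strategies produce values that carry a checksum. As one example, a SIREN is a 9-digit number whose last digit is a Luhn check digit. The sketch below shows how such a value could be generated and verified; the function names are hypothetical and this is not the code behind `siren_fr`.

```php
<?php

declare(strict_types=1);

/**
 * Computes the Luhn check digit for a string of digits (hypothetical helper).
 */
function sp_luhn_check_digit(string $payload): int {
  $sum = 0;
  // Walk right to left. The rightmost payload digit sits next to the check
  // digit, so it is the first one to be doubled.
  $double = TRUE;
  for ($i = strlen($payload) - 1; $i >= 0; $i--) {
    $digit = (int) $payload[$i];
    if ($double) {
      $digit *= 2;
      if ($digit > 9) {
        $digit -= 9;
      }
    }
    $sum += $digit;
    $double = !$double;
  }
  return (10 - $sum % 10) % 10;
}

/**
 * Checks that the last digit of $number is its Luhn check digit.
 */
function sp_luhn_is_valid(string $number): bool {
  return ctype_digit($number)
    && sp_luhn_check_digit(substr($number, 0, -1)) === (int) substr($number, -1);
}

/**
 * Builds a random 9-digit, Luhn-valid placeholder in SIREN format.
 */
function sp_fake_siren(): string {
  $payload = '';
  for ($i = 0; $i < 8; $i++) {
    $payload .= (string) random_int(0, 9);
  }
  return $payload . sp_luhn_check_digit($payload);
}

$siren = sp_fake_siren();
var_dump(sp_luhn_is_valid($siren));
```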

### General settings

- **Max username length** — hard cap applied by the username generator.
- **Faker locale** — locale code (e.g., `pt_PT`, `fr_FR`). This field is
  disabled if Faker is not installed. When disabled, the algorithmic
  fallback is used and locale has no effect.
- **Deterministic replacements** — when enabled, strategies can seed
  their generators to produce repeatable results.
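
The interaction between the length cap and seeding can be sketched as follows. This assumes a simple consonant/vowel generator seeded through PHP's `mt_srand()`; `sp_fake_username()` is a hypothetical function, and the module's internal generator may work differently.

```php
<?php

declare(strict_types=1);

/**
 * Builds a pronounceable username, capped at $max_length characters.
 *
 * Hypothetical helper. Passing a seed reseeds PHP's global Mersenne Twister,
 * so the same seed yields the same name.
 */
function sp_fake_username(int $max_length, ?int $seed = NULL): string {
  if ($max_length < 1) {
    throw new InvalidArgumentException('Max length must be at least 1.');
  }
  if ($seed !== NULL) {
    mt_srand($seed);
  }
  $consonants = 'bcdfghjklmnprstv';
  $vowels = 'aeiou';
  $name = '';
  // Alternate consonant and vowel so the result stays pronounceable.
  $syllables = mt_rand(2, 4);
  for ($i = 0; $i < $syllables; $i++) {
    $name .= $consonants[mt_rand(0, strlen($consonants) - 1)];
    $name .= $vowels[mt_rand(0, strlen($vowels) - 1)];
  }
  // Hard cap, mirroring the "Max username length" setting.
  return substr($name, 0, $max_length);
}

// Same seed, same result.
var_dump(sp_fake_username(6, 123) === sp_fake_username(6, 123));
```

Reseeding a global generator affects every later `mt_rand()` call in the same process, which is why a real implementation might prefer a generator instance scoped to each strategy.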

### Run after `sql:sanitize`

The post-hook runs `sp:fake` automatically at the end of
`drush sql:sanitize`.

- **Scope**: fixed to `sanitized` (only fields that Drush sanitized and
  that also match your rules are processed).
- **Limit per rule**: max entities per rule during the post-hook run.
  Default: `5000` (separate from the CLI default below).

### Form rules field helper

In the settings form, add one rule per line using:

```text
entity.bundle.field=strategy
```

This rules field drives both behaviors:

- On `drush sql:sanitize`, the post-hook runs with scope `sanitized` and
  acts only on the fields that Drush has just sanitized and that match
  the rules above.
- When you run `drush sp:fake`, it considers every configured rule under
  the default scope `sanitized`, unless you narrow the run with
  `--entity`, `--bundle`, or `--field`, or choose a different `--scope`.

## Drush

### Command

```text
drush sp:fake [--entity=ENTITY] [--bundle=BUNDLE] [--field=FIELD] [--scope=SCOPE] [--limit=LIMIT] [--seed=SEED]
```

**Options (short explanations):**

- `--entity=ENTITY` — limit to an entity type (e.g., `user`, `node`,
  `taxonomy_term`).
- `--bundle=BUNDLE` — limit to a bundle (e.g., `article`, `page`).
- `--field=FIELD` — limit to one field machine name
  (e.g., `field_first_name`).
- `--scope=SCOPE` — one of `sanitized` (default), `empty`, `all`.
  - `sanitized`: only fields that Drush sanitized.
  - `empty`: only fields that are currently empty.
  - `all`: every field that matches a rule.
- `--limit=LIMIT` — max entities per rule to process.
  Default: `100` for the CLI command.
- `--seed=SEED` — integer; when set, data generation is deterministic and
  repeatable.

**Examples**

```bash
# Fill only empty first/last names for users, up to 500 users, deterministic:
drush sp:fake --entity=user --field=field_first_name --scope=empty --limit=500 --seed=123
drush sp:fake --entity=user --field=field_last_name --scope=empty --limit=500 --seed=123

# Replace usernames for all users, respecting configured max length:
drush sp:fake --entity=user --field=name --scope=all

# For articles, generate company names for a specific field:
drush sp:fake --entity=node --bundle=article --field=field_company --scope=all --seed=42
```

> The Drush command applies your configured rules and then filters by the
> options above.
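
The "rules first, then filter" behaviour can be pictured with the sketch below. It assumes rules are already parsed into arrays with `entity`, `bundle`, `field`, and `strategy` keys; both that shape and `sp_filter_rules()` are hypothetical, chosen only to illustrate how unset options leave a rule unconstrained.

```php
<?php

declare(strict_types=1);

/**
 * Keeps only the rules that match every option that was provided.
 *
 * Hypothetical helper; $options may contain 'entity', 'bundle' and 'field'.
 */
function sp_filter_rules(array $rules, array $options): array {
  $keys = ['entity', 'bundle', 'field'];
  $matches = static function (array $rule) use ($options, $keys): bool {
    foreach ($keys as $key) {
      // An option that is not set does not constrain the rule.
      if (isset($options[$key]) && $options[$key] !== $rule[$key]) {
        return FALSE;
      }
    }
    return TRUE;
  };
  return array_values(array_filter($rules, $matches));
}

$rules = [
  ['entity' => 'user', 'bundle' => 'user', 'field' => 'name', 'strategy' => 'username'],
  ['entity' => 'user', 'bundle' => 'user', 'field' => 'field_first_name', 'strategy' => 'first_name'],
  ['entity' => 'node', 'bundle' => 'article', 'field' => 'field_company', 'strategy' => 'institution'],
];

// Equivalent in spirit to: drush sp:fake --entity=user --field=name
$selected = sp_filter_rules($rules, ['entity' => 'user', 'field' => 'name']);
print $selected[0]['strategy'] . "\n";
```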

## Extra settings (when `sanitize_placeholder_extra` is enabled)

- **Domain strategy**
  - **TLD allowlist** — comma-separated, e.g., `com, org, eu, fr, pt`
  - **Allowed words** — one per line; combined into friendly domains.
- **Patterns** (`pattern` strategy)
  - List of templates, one per line; e.g.:
    - `{letters:3}-{digits:4}`
    - `{company}-{digits:3}`
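
A template such as `{letters:3}-{digits:4}` can be expanded token by token. The sketch below handles only the `letters` and `digits` tokens and leaves anything else (for example `{company}`) untouched; the use of uppercase letters and the name `sp_expand_pattern()` are assumptions, not the module's documented behaviour.

```php
<?php

declare(strict_types=1);

/**
 * Expands {letters:N} and {digits:N} tokens in a template.
 *
 * Hypothetical helper; unknown tokens are returned unchanged.
 */
function sp_expand_pattern(string $template): string {
  $expanded = preg_replace_callback(
    '/\{(letters|digits):(\d+)\}/',
    static function (array $m): string {
      // Assumption: letters are uppercase A-Z.
      $alphabet = $m[1] === 'letters' ? 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' : '0123456789';
      $out = '';
      for ($i = 0; $i < (int) $m[2]; $i++) {
        $out .= $alphabet[random_int(0, strlen($alphabet) - 1)];
      }
      return $out;
    },
    $template
  );
  // preg_replace_callback() returns NULL on a regex error; fall back to input.
  return $expanded ?? $template;
}

// Produces three letters, a hyphen, then four digits.
print sp_expand_pattern('{letters:3}-{digits:4}') . "\n";
```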

## Extending: add your own strategy

You can ship strategies from any module. Implement the interface,
register your class as a service, and tag it.

**Example: PHP strategy class (text only)**

```php
<?php

declare(strict_types=1);

namespace Drupal\my_module\Strategy;

use Drupal\Core\Entity\EntityInterface;
use Drupal\Core\Field\FieldDefinitionInterface;
use Drupal\sanitize_placeholder\Service\ThematicFaker;
use Drupal\sanitize_placeholder\Strategy\StrategyInterface;

/**
 * Generates department-like labels.
 */
final class DepartmentStrategy implements StrategyInterface
{
    public function __construct(
        private readonly ThematicFaker $faker,
    ) {}

    public function id(): string
    {
        return 'department';
    }

    public function label(): string
    {
        return 'Department';
    }

    public function generate(
        EntityInterface $entity,
        FieldDefinitionInterface $field,
    ): string {
        if ($this->faker->hasFaker()) {
            // Faker is installed: build the label from its company formatter.
            return $this->faker->get()->company() . ' Dept';
        }
        // Fallback when Faker is not present: use the internal generator.
        return $this->faker->firstName() . ' Division';
    }
}
```

**Example: service definition (text only)**

```yml
services:
  sanitize_placeholder.strategy.department:
    class: Drupal\my_module\Strategy\DepartmentStrategy
    arguments: ['@sanitize_placeholder.thematic_faker']
    tags:
      - { name: 'sanitize_placeholder.strategy' }
```

Enable your module and clear caches. The new strategy appears in the
**Available strategies** list and can be used in rules like:

```text
user.user.field_user_department=department
```

## Notes and tips

- **Faker optionality**: if Faker is missing, you still get varied names
  and companies from the internal generator. Locale applies only when
  Faker is present.
- **Caching**: the module invalidates UI-affecting caches (for example,
  `rendered` and `user_list`) after bulk replacements.

## Troubleshooting

- **“Faker\Factory not found”**: install Faker on the site:
  `composer require fakerphp/faker:^1.23`. The **Faker locale** field is
  then enabled automatically.
- **Post-hook did not run or changed nothing**: the post-hook with scope
  `sanitized` only affects fields that Drush sanitized and that also
  match your rules.
  To force replacements regardless of what Drush sanitized, run:
  ```bash
  drush sp:fake --scope=all
  ```
  (optionally with `--entity`, `--bundle`, and `--field` to target
  specific fields).

## License

GPL-2.0-or-later
