# File MIME Type Enforcer

A Drupal module that hardens file uploads by comparing the MIME type implied by a file's extension with the MIME type detected from its content.

## Key Features

- **Dual MIME Detection**: Compares Drupal's extension-based detection with Symfony's content-based analysis
- **Security Protection**: Blocks files whose extension doesn't match their actual content (e.g., PHP files renamed to .jpg)
- **Configurable Alternatives**: Define acceptable MIME type variations per file extension
- **Flexible Validation**: Strict mode (reject mismatches) or permissive mode (log only)
- **Audit Command**: Scan existing files for MIME type discrepancies

## Installation & Configuration

1. Enable the module: `drush en file_mime_type_enforcer` or at `/admin/modules`
2. Configure at: `/admin/config/media/file-mime-type-enforcer`

### Basic Settings
- **Enable enforcement**: Toggle validation on/off
- **Strict mode**: Reject files with MIME mismatches (when disabled, mismatches are only logged)
- **Log violations**: Record mismatches for monitoring

### Alternative MIME Mappings
Handle legitimate cases where detection methods differ:

```json
{
  "jpg": ["image/jpeg", "image/pjpeg"],
  "pdf": ["application/pdf", "application/x-pdf"], 
  "docx": ["application/vnd.openxmlformats-officedocument.wordprocessingml.document", "application/zip"]
}
```
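
The lookup implied by this mapping can be sketched as follows. This is a minimal Python illustration, not the module's PHP code: the `validate` helper, its parameters, and the reason strings are hypothetical, and the strict/permissive behaviour is taken from the feature list above.

```python
# Same shape as the JSON mapping above: extension -> acceptable MIME types.
ALTERNATIVES = {
    "jpg": ["image/jpeg", "image/pjpeg"],
    "pdf": ["application/pdf", "application/x-pdf"],
    "docx": [
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
        "application/zip",
    ],
}


def validate(extension, expected_mime, detected_mime, strict=True):
    """Return (accepted, reason) for one upload. Hypothetical helper."""
    if detected_mime == expected_mime:
        return True, "match"
    # A mismatch is still acceptable if it is a configured alternative.
    if detected_mime in ALTERNATIVES.get(extension.lower(), []):
        return True, "accepted alternative"
    # Strict mode rejects; permissive mode only records the mismatch.
    if strict:
        return False, "mismatch rejected"
    return True, "mismatch logged only"
```

Under these assumptions, a `.docx` file whose content is detected as `application/zip` is accepted, while a `.jpg` whose content is detected as PHP is rejected in strict mode.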

## Security Benefits

Prevents common attack vectors:
- **Extension spoofing**: Malicious files with misleading extensions
- **Content mismatch**: Files where extension doesn't match content
- **Example**: PHP file renamed to .jpg → Extension says image, content says PHP → **BLOCKED**
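
The comparison behind this example can be sketched in a few lines. The Python below is only an illustration of the idea; the module itself relies on Drupal's extension-based detection and Symfony's content-based analysis. The lookup tables and function names are hypothetical and far smaller than what real detectors know.

```python
import os

# Hypothetical extension table standing in for extension-based detection.
EXTENSION_MIME = {
    ".jpg": "image/jpeg",
    ".png": "image/png",
    ".pdf": "application/pdf",
    ".php": "text/x-php",
}

# Hypothetical signature table standing in for content-based detection.
MAGIC_SIGNATURES = [
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"%PDF-", "application/pdf"),
    (b"<?php", "text/x-php"),
]

UNKNOWN = "application/octet-stream"


def mime_from_extension(filename):
    """Guess the MIME type from the file name only."""
    _root, ext = os.path.splitext(filename)
    return EXTENSION_MIME.get(ext.lower(), UNKNOWN)


def mime_from_content(data):
    """Guess the MIME type from the leading bytes only."""
    for signature, mime in MAGIC_SIGNATURES:
        if data.startswith(signature):
            return mime
    return UNKNOWN


def is_consistent(filename, data):
    """True when both detection methods agree."""
    return mime_from_extension(filename) == mime_from_content(data)
```

With these tables, `is_consistent("evil.jpg", b"<?php ...")` is false because the name implies `image/jpeg` while the bytes look like PHP, which is the mismatch the module is designed to catch.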

## Drush Commands

### File Audit

The audit processes files through a queue, so an interrupted run can be resumed where it stopped:

```bash
# Basic audit
drush file-mime-type-enforcer:audit
# or
drush fmte:audit

# Show only files with errors
drush fmte:audit --display-errors

# Reset existing queue and repopulate
drush fmte:audit --reset

# Reset the queue, process at most 100 files, and show only files with errors
drush fmte:audit --limit=100 --display-errors --reset
```

#### Command Options

| Option | Description | Default |
|--------|-------------|---------|
| `--limit` | Maximum number of files to process in this run | All files |
| `--display-errors` | Show only problematic files | FALSE |
| `--reset` | Reset the queue and repopulate it | FALSE |

#### Queue-Based Processing Benefits
- **Resumable**: If a run is interrupted, restart it without losing progress
- **Efficient**: Process large datasets without memory issues
- **Flexible**: Resume with different options (e.g., add `--display-errors`)
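
The resumable behaviour described above can be sketched as follows. This is a rough Python model under stated assumptions, not the module's implementation: an in-memory `deque` stands in for a persistent queue, and `populate`, `process`, and the `check` callback are hypothetical names.

```python
from collections import deque


def populate(queue, file_ids, reset=False):
    """Fill the queue; an existing queue is kept unless reset is requested."""
    if reset:
        queue.clear()
    if not queue:
        queue.extend(file_ids)


def process(queue, check, limit=None):
    """Check up to `limit` items; anything left stays queued for the next run."""
    failures = []
    processed = 0
    while queue and (limit is None or processed < limit):
        file_id = queue.popleft()
        if not check(file_id):
            failures.append(file_id)
        processed += 1
    return processed, failures
```

Calling `process` with a limit and then again without one finishes the remaining items instead of starting over, which mirrors running the audit with `--limit` and resuming later; passing `reset=True` to `populate` corresponds to `--reset`.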

#### Sample Output

```
Populating queue with file IDs...
Successfully queued 1,234 files for processing.
Starting file MIME type audit from queue...
Processing 1,234 files from queue...
Progress: 1200/1234 (97.2%) - Queue remaining: 34
Progress: 1234/1234 (100.0%) - Queue remaining: 0

AUDIT COMPLETE
Total Files: 1,234
Processed: 1,230
MIME Matches: 1,180
MIME Discrepancies: 50
Validation Passed: 1,200
Validation Failed: 30

Some files failed validation. Check logs for details.
```

**Results Explained:**
- **MIME Matches**: Files whose stored MIME type matches the expected type
- **MIME Discrepancies**: Files whose types differ (these may still pass if the detected type is a configured alternative)
- **Validation Failed**: Files that would be rejected if uploaded today
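
One way to read these counters is sketched below. It assumes that every processed file is counted once as either a match or a discrepancy, and once as either passed or failed; the sample output is consistent with that reading (1,180 + 50 = 1,230 and 1,200 + 30 = 1,230), but the `AuditResult` class and `summarize` function are hypothetical and not taken from the module.

```python
from dataclasses import dataclass


@dataclass
class AuditResult:
    mime_matches: bool       # stored MIME type equals the expected type
    validation_passed: bool  # file would be accepted if uploaded today


def summarize(results):
    """Tally per-file results into the counters shown in the sample output."""
    processed = len(results)
    matches = sum(r.mime_matches for r in results)
    passed = sum(r.validation_passed for r in results)
    return {
        "Processed": processed,
        "MIME Matches": matches,
        "MIME Discrepancies": processed - matches,
        "Validation Passed": passed,
        "Validation Failed": processed - passed,
    }
```

Under this reading, the sample run contains 20 files with a discrepancy that still pass validation, presumably through the alternative MIME mappings.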

## Requirements

- **PHP fileinfo extension** (required)
- **Drupal 9 or 10**
- Core modules: `file`, `system`

## Testing

1. Enable module and configure settings
2. Upload test files (normal images, renamed executables)
3. Check validation results in logs: `/admin/reports/dblog`

## Troubleshooting

**fileinfo extension missing**: Check with `php -m | grep fileinfo`

**Too strict/permissive**: Adjust alternative MIME mappings or toggle strict mode

**Performance**: The audit is queue-based and memory efficient; on large sites, process files in batches with `--limit` and resume later

## Maintainers

* Kevin Robinson
* Brittany Huntzberry
* Ian Sholtys