# Spam Referrer Detection

The Visitors module includes spam referrer detection, which identifies and labels referral traffic from known spam sites so it can be separated from legitimate traffic in your analytics data.

## Overview

The module detects referrer spam: visits whose referrer URL points to a known spam domain. Labeling these visits keeps your referrer reports clean and prevents spam sites from polluting your analytics data.

## How It Works

### Spam Site Database
The module maintains a comprehensive database of known spam referrer sites in `config/install/visitors.spam.yml`:
- **2,300+ known spam domains** included by default
- **Automatic domain matching**, including subdomains
- **Updated with module releases** as new spam sites are added

### Detection Process
1. **Referrer Analysis**: When a visitor arrives, the module checks the referrer URL
2. **Domain Extraction**: Extracts the domain from the referrer URL
3. **Spam Matching**: Compares against the spam site database
4. **Classification**: Marks spam referrers with `referer_type = 'spam'` in the database
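
The four steps above can be sketched in plain PHP. This is a minimal illustration, not the module's actual code: the function names `extract_referrer_domain()` and `is_spam_domain()` and the sample `$spam_sites` list are hypothetical, and the subdomain rule (match the host itself or any of its parent domains) is an assumption about how subdomain matching could work.

```php
<?php

/**
 * Extracts the lowercase host from a referrer URL, or NULL if none is found.
 */
function extract_referrer_domain(string $referrer_url): ?string {
  $host = parse_url($referrer_url, PHP_URL_HOST);
  if (!is_string($host) || $host === '') {
    return NULL;
  }
  return strtolower($host);
}

/**
 * Checks a host, and each of its parent domains, against a spam list.
 */
function is_spam_domain(string $host, array $spam_sites): bool {
  // Flip once so each lookup is a hash check instead of a list scan.
  $lookup = array_flip($spam_sites);
  $labels = explode('.', $host);
  // Test "a.b.example.com", then "b.example.com", then "example.com".
  // Stop at two labels so a bare TLD such as "com" is never tested.
  while (count($labels) >= 2) {
    if (isset($lookup[implode('.', $labels)])) {
      return TRUE;
    }
    array_shift($labels);
  }
  return FALSE;
}

// Hypothetical sample list; the real list lives in visitors.spam.yml.
$spam_sites = ['0-0.fr', '100dollars-seo.com'];

$host = extract_referrer_domain('https://promo.100dollars-seo.com/offer?id=1');
$referer_type = ($host !== NULL && is_spam_domain($host, $spam_sites)) ? 'spam' : 'website';
```

Walking up the parent domains keeps the list short: one entry for `100dollars-seo.com` also covers `promo.100dollars-seo.com` without listing every subdomain.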

## Configuration

### Spam Detection Settings
Spam detection is enabled automatically and runs in the background. There are no user-configurable settings for basic spam detection.

### Viewing Spam Detection
- **Referrer Reports**: Spam referrers are marked as "Spam" in referrer analytics
- **Data Classification**: Spam referrers are still stored but labeled for identification
- **Background Processing**: Detection happens automatically during data processing

## Spam Site Database

### Included Spam Sites
The module includes detection for common spam referrer categories:
- **SEO spam sites**: Fake SEO traffic generators
- **Casino/gambling spam**: Gambling referrer spam
- **Malware sites**: Known malicious referrers
- **Bot networks**: Automated spam referrers
- **Adult content spam**: Adult content referrer spam

### Example Spam Sites (partial list)
```yaml
# From visitors.spam.yml
sites:
  - 0-0.fr
  - 100dollars-seo.com
  - 1-best-seo.com
  - casino-spam-site.com
  - fake-traffic.net
  # ... 2,300+ more domains
```

## Technical Implementation

### SpamService
The `SpamService` handles spam detection:

```php
// Look up the spam service and test a referrer domain (host only, not a full URL).
$spam_service = \Drupal::service('visitors.spam');
$is_spam = $spam_service->match($referrer_domain);
```

### Integration Points
- **Referrer classification** in visitor tracking
- **Views integration** with RefererType field showing "Spam" label
- **Static configuration** via YAML file

## How Classification Works

### Referrer Types
When processing visitor data, referrers are classified into these types:

```php
// From VisitorsController.php
if (empty($referrer_url)) {
  $referer_type = 'direct';           // No referrer
} elseif ($same_host) {
  $referer_type = 'internal';         // Internal site links
} elseif ($spam_service->match($domain)) {
  $referer_type = 'spam';             // Spam referrer sites
} elseif ($search_engine_match) {
  $referer_type = 'search_engine';    // Google, Bing, etc.
} else {
  $referer_type = 'website';          // Other legitimate websites
}
```
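
The same precedence can be written as a standalone function, which makes the ordering easier to test in isolation. This is a minimal sketch under stated assumptions: `classify_referrer()`, the `$site_host` parameter, and the `$is_spam` and `$is_search_engine` callbacks are hypothetical stand-ins, not part of the module's API.

```php
<?php

/**
 * Classifies a referrer URL using the precedence described above.
 */
function classify_referrer(string $referrer_url, string $site_host, callable $is_spam, callable $is_search_engine): string {
  if ($referrer_url === '') {
    return 'direct';
  }
  // A URL without a parsable host yields an empty string here.
  $host = strtolower((string) parse_url($referrer_url, PHP_URL_HOST));
  if ($host === strtolower($site_host)) {
    return 'internal';
  }
  // Spam is checked before search engines, mirroring the order above.
  if ($is_spam($host)) {
    return 'spam';
  }
  if ($is_search_engine($host)) {
    return 'search_engine';
  }
  return 'website';
}

// Hypothetical stand-ins for the spam service and the search engine lookup.
$is_spam = fn(string $host): bool => in_array($host, ['1-best-seo.com'], TRUE);
$is_search_engine = fn(string $host): bool => in_array($host, ['www.google.com'], TRUE);

$types = [
  classify_referrer('', 'example.org', $is_spam, $is_search_engine),
  classify_referrer('https://example.org/about', 'example.org', $is_spam, $is_search_engine),
  classify_referrer('http://1-best-seo.com/', 'example.org', $is_spam, $is_search_engine),
  classify_referrer('https://www.google.com/search?q=drupal', 'example.org', $is_spam, $is_search_engine),
  classify_referrer('https://blog.example.net/post', 'example.org', $is_spam, $is_search_engine),
];
```

Passing the checks in as callbacks keeps the ordering logic separate from the lookups, so each branch can be exercised with small fixed lists.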

### Data Storage
- **All referrers are stored** in the database
- **Spam referrers** get `referer_type = 'spam'`
- **Views and reports** can filter or display spam separately
- **RefererType field** shows "Spam" label in Views
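
Because spam rows are labeled rather than dropped, a report can decide at read time whether to include them. The sketch below uses an in-memory array in place of a database query; the row structure and the `referrer_host` key are hypothetical, and only the `referer_type` values come from the classification described above.

```php
<?php

// Hypothetical rows standing in for stored visit records.
$visits = [
  ['referrer_host' => 'blog.example.net', 'referer_type' => 'website'],
  ['referrer_host' => '0-0.fr', 'referer_type' => 'spam'],
  ['referrer_host' => 'www.google.com', 'referer_type' => 'search_engine'],
  ['referrer_host' => '1-best-seo.com', 'referer_type' => 'spam'],
];

// Count visits per type; spam stays visible as its own bucket.
$counts = array_count_values(array_column($visits, 'referer_type'));

// Build a clean report by excluding spam rows at read time.
$clean = array_values(array_filter(
  $visits,
  fn(array $row): bool => $row['referer_type'] !== 'spam'
));
```

Keeping the spam rows means the same data can answer both "how much spam arrived?" and "where does legitimate traffic come from?".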

## Benefits

### Clean Analytics
- **Spam identification**: Spam referrers are clearly labeled as "Spam"
- **Better insights**: Easy identification of legitimate vs spam traffic sources
- **Data transparency**: All referrers are preserved but properly categorized

### Automatic Maintenance
- **No configuration needed**: Works out of the box
- **Regular updates**: Spam database updated with module releases
- **Background processing**: Detection runs automatically during data processing, with no manual steps

## Limitations

### What It Does NOT Do
- **General spam protection**: Only detects referrer spam
- **Comment spam**: Not related to comment or form spam
- **Bot detection**: Separate from general bot/crawler detection
- **IP blocking**: Does not block IP addresses

### Scope
- **Referrer-only**: Only analyzes referrer URLs
- **Domain-based**: Matches domains, not full URLs
- **Passive labeling**: Classifies stored referrer data; does not block access

For general site security and spam protection, use dedicated security modules alongside Visitors.
