# Dify Search API

This submodule provides a Search API backend for Dify, so that Drupal content is indexed automatically into Dify knowledge bases.

## Features

- **Dify Search API Backend**: Index Drupal content to Dify knowledge base
- **Multiple Indexing Modes**: Support for both text and CSV file indexing
- **Automatic Metadata Handling**: Automatically adds content URLs as metadata for source linking
- **ArrayToCsv Trait**: Utility for converting data arrays to CSV format
- **Secure Configuration**: Credentials are kept in Drupal state rather than in exportable configuration
- **Batch Processing**: Efficient handling of large content volumes

## How It Works

The Search API backend integrates with Dify's knowledge base API in five steps:

1. **Content Processing**: Extracts and processes content from Drupal entities
2. **Format Conversion**: Converts content to text or CSV format based on configuration
3. **Metadata Addition**: Automatically adds content URLs for source attribution
4. **API Communication**: Sends processed content to Dify knowledge base via REST API
5. **Document Management**: Handles creation, updates, and deletion of documents in Dify
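
Steps 1 to 3 can be sketched in plain PHP, outside Drupal. This is only an illustration: `build_text_document()` and the `name`, `text`, and `metadata` keys are hypothetical names chosen for the example, not the module's or Dify's actual structures.

```php
<?php
declare(strict_types=1);

/**
 * Builds a text-mode document from already-extracted field values.
 *
 * Hypothetical helper: the returned keys are illustrative only.
 */
function build_text_document(array $fields, string $url): array {
  // Steps 1-2: drop empty fields, strip markup, flatten to labelled lines.
  $lines = [];
  foreach ($fields as $label => $value) {
    $value = trim(strip_tags((string) $value));
    if ($value !== '') {
      $lines[] = $label . ': ' . $value;
    }
  }

  // Step 3: attach the content URL so answers can link back to the source.
  return [
    'name' => (string) ($fields['title'] ?? 'Untitled'),
    'text' => implode("\n", $lines),
    'metadata' => ['url' => $url],
  ];
}

$document = build_text_document(
  ['title' => 'Hello', 'body' => '<p>First post.</p>', 'summary' => ''],
  'https://example.com/node/1'
);
```

Here the empty `summary` field is skipped and the body markup is removed before the text is assembled.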

## Installation

This submodule is automatically available when the parent Dify module is installed. Enable it via:

```bash
drush en dify_search_api
```

Or through the Drupal admin interface at **Extend** (`/admin/modules`).

## Configuration

### 1. Create Search API Server

1. Navigate to **Configuration > Search and metadata > Search API** (`/admin/config/search/search-api`)
2. Click **Add server**
3. Choose **Dify Backend** as the backend
4. Configure the server settings

### 2. Configure Dify Credentials

In the server configuration form, provide:

- **Base URL**: Your Dify knowledge base API URL
- **API Key**: Your Dify API key for knowledge base operations
- **Dataset ID**: The target dataset identifier in Dify

### 3. Configure Indexing Options

- **Indexing Mode**: Choose between:
  - **Text Mode**: Sends content as plain text (recommended for most use cases)
  - **File Mode**: Uploads content as CSV files
- **Metadata Options**: Enable automatic URL metadata addition

### 4. Create Search API Index

1. Create a new Search API index
2. Select your Dify server as the index's server
3. Select the datasources and content types to index
4. Configure fields to include:
   - **Title**: Content title
   - **Body**: Main content body
   - **Summary**: Content summary/excerpt
   - **Tags/Keywords**: Taxonomy terms or keywords
   - **URL**: Content URL (automatically added as metadata)

### 5. Index Content

- **Manual Indexing**: Use the Search API interface to manually index content
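- **Automatic Indexing**: Content is automatically indexed when created or updated (immediately if the index's **Index items immediately** option is enabled, otherwise on the next cron run)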
- **Batch Indexing**: Use Drush commands for bulk indexing

## Usage Examples

### Basic Content Indexing

```bash
# Index all enabled indexes
drush search-api:index

# Index a specific index
drush search-api:index your_index_name

# Clear and reindex
drush search-api:clear your_index_name
drush search-api:index your_index_name
```

### Monitoring Indexing Status

```bash
# Check indexing status
drush search-api:status

# List Search API servers and their status
drush search-api:server-list
```

## Technical Implementation

### Backend Plugin

The `SearchApiDifyBackend` plugin (`@SearchApiBackend` annotation) provides:

- **Configuration Form**: Server configuration interface
- **Indexing Methods**: Content processing and API communication
- **Document Management**: CRUD operations for Dify documents
- **Error Handling**: Robust error handling and logging

### ArrayToCsv Trait

A utility trait for CSV conversion. It provides:

- **Array to CSV Conversion**: Converts associative arrays to CSV format
- **Header Management**: Automatic CSV header generation
- **Encoding Handling**: Proper UTF-8 encoding support
- **Delimiter Configuration**: Configurable CSV delimiters
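
As a rough illustration of these features, the following stand-in trait converts a list of associative arrays to CSV using PHP's built-in `fputcsv()`. The name `ArrayToCsvSketch` and its `arrayToCsv()` method are assumptions made for this example; the module's real trait may expose a different interface.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for the module's ArrayToCsv trait.
 */
trait ArrayToCsvSketch {

  /**
   * Converts a list of associative arrays to a CSV string.
   */
  public function arrayToCsv(array $rows, string $delimiter = ','): string {
    if ($rows === []) {
      return '';
    }
    // Header: union of keys across all rows, in first-seen order.
    $header = [];
    foreach ($rows as $row) {
      foreach (array_keys($row) as $key) {
        $header[$key] = TRUE;
      }
    }
    $header = array_keys($header);

    // fputcsv() is byte-safe, so UTF-8 input passes through unchanged.
    $handle = fopen('php://temp', 'r+');
    fputcsv($handle, $header, $delimiter, '"', '\\');
    foreach ($rows as $row) {
      $line = [];
      foreach ($header as $key) {
        // Missing keys become empty cells so columns stay aligned.
        $line[] = (string) ($row[$key] ?? '');
      }
      fputcsv($handle, $line, $delimiter, '"', '\\');
    }
    rewind($handle);
    $csv = stream_get_contents($handle);
    fclose($handle);
    return $csv;
  }

}

$converter = new class {
  use ArrayToCsvSketch;
};
$csv = $converter->arrayToCsv([
  ['title' => 'A, B', 'url' => 'https://example.com/1'],
  ['title' => 'C', 'tags' => 'x'],
]);
```

Writing to `php://temp` keeps quoting and escaping in the hands of `fputcsv()` instead of hand-built string concatenation.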

### API Integration

The backend uses the shared `DifyClient` service for:

- **Document Creation**: Creating new documents in Dify
- **Document Updates**: Updating existing documents
- **Document Deletion**: Removing documents from Dify
- **Batch Operations**: Efficient batch processing
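
To make the three document operations concrete, the sketch below describes the HTTP request for each one without sending it. The function `describe_request()` is hypothetical, and the endpoint paths are assumptions based on Dify's published knowledge base API; they are not taken from `DifyClient` and should be checked against the Dify version in use.

```php
<?php
declare(strict_types=1);

/**
 * Describes the HTTP request for one document operation.
 *
 * Hypothetical helper; paths are assumed, not read from the module.
 */
function describe_request(string $op, string $baseUrl, string $apiKey, string $datasetId, ?string $documentId = NULL): array {
  if (!in_array($op, ['create', 'update', 'delete'], TRUE)) {
    throw new InvalidArgumentException("Unknown operation: $op");
  }
  if ($op !== 'create' && ($documentId === NULL || $documentId === '')) {
    throw new InvalidArgumentException("Operation '$op' needs a document ID.");
  }

  $root = rtrim($baseUrl, '/') . '/datasets/' . rawurlencode($datasetId);
  $doc = $root . '/documents/' . rawurlencode((string) $documentId);
  $routes = [
    'create' => ['POST', $root . '/document/create-by-text'],
    'update' => ['POST', $doc . '/update-by-text'],
    'delete' => ['DELETE', $doc],
  ];
  [$method, $url] = $routes[$op];

  return [
    'method' => $method,
    'url' => $url,
    'headers' => [
      // Bearer authentication with the knowledge base API key.
      'Authorization' => 'Bearer ' . $apiKey,
      'Content-Type' => 'application/json',
    ],
  ];
}

$request = describe_request('delete', 'https://dify.example.com/v1/', 'your-api-key', 'ds1', 'doc9');
```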

## Indexing Modes

### Text Mode (Recommended)

- **Format**: Plain text content
- **Advantages**: Simple, fast, direct integration
- **Use Cases**: Most content types, articles, pages
- **Metadata**: URLs added as document metadata

### File Mode

- **Format**: CSV files uploaded to Dify
- **Advantages**: Structured data, bulk operations
- **Use Cases**: Large datasets, structured content
- **Metadata**: Embedded within CSV structure

## Troubleshooting

### Common Issues

1. **Authentication Errors**
   - Verify API key is correct
   - Check base URL accessibility
   - Ensure dataset ID exists

2. **Indexing Failures**
   - Check Drupal logs for error messages
   - Verify content permissions
   - Test API connectivity

3. **Performance Issues**
   - Consider using file mode for large volumes
   - Adjust batch sizes in configuration
   - Monitor server resources

### Debugging

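To investigate indexing problems:
1. Show all messages at **Configuration > Development > Logging and errors** (`/admin/config/development/logging`)
2. Monitor recent log messages at `/admin/reports/dblog`
3. Check the Search API status pages for the server and index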

## Dependencies

- **Parent Module**: `dify:dify` (main Dify module)
- **Search API**: `search_api:search_api` (Drupal Search API module)
- **Drupal Core**: Compatible with Drupal 9, 10, and 11

## Migration Notes

This submodule was created by moving the following components from the main `dify` module:

- `src/Plugin/search_api/backend/SearchApiDifyBackend.php`
- `src/Traits/ArrayToCsvTrait.php`
- Associated Search API configurations

Existing configurations are automatically updated to use the new submodule structure.

## API Reference

### Backend Configuration

```php
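$backend_config = [
  // Dify knowledge base API URL.
  'base_url' => 'https://your-dify-api.com',
  // Dify API key for knowledge base operations.
  'api_key' => 'your-api-key',
  // Target dataset identifier in Dify.
  'dataset_id' => 'your-dataset-id',
  // Indexing mode: 'text' or 'file'.
  'indexing_mode' => 'text',
  // Add content URLs as document metadata.
  'metadata_enabled' => TRUE,
];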
```
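
A configuration array of this shape can be sanity-checked before use. The helper below is a minimal sketch, assuming the keys shown above; `validate_backend_config()` is a hypothetical name, and the module's own form validation may differ.

```php
<?php
declare(strict_types=1);

/**
 * Returns a list of problems found in a backend configuration array.
 *
 * Hypothetical helper, not part of the module.
 */
function validate_backend_config(array $config): array {
  $errors = [];

  // The three connection settings are all required.
  foreach (['base_url', 'api_key', 'dataset_id'] as $key) {
    if (!isset($config[$key]) || trim((string) $config[$key]) === '') {
      $errors[] = "Missing required setting: $key";
    }
  }

  // Credentials travel with every request, so insist on HTTPS.
  $baseUrl = (string) ($config['base_url'] ?? '');
  $scheme = strtolower((string) parse_url($baseUrl, PHP_URL_SCHEME));
  if ($baseUrl !== '' && $scheme !== 'https') {
    $errors[] = 'base_url should use HTTPS';
  }

  // Only the two documented indexing modes are accepted.
  $mode = (string) ($config['indexing_mode'] ?? 'text');
  if (!in_array($mode, ['text', 'file'], TRUE)) {
    $errors[] = "Unknown indexing_mode: $mode";
  }

  return $errors;
}

$errors = validate_backend_config([
  'base_url' => 'http://dify.example.com',
  'api_key' => '',
  'dataset_id' => 'your-dataset-id',
  'indexing_mode' => 'pdf',
]);
```

An empty array means no problems were found.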

### Indexing Process

The indexing process follows these steps:

1. **Content Extraction**: Extract field values from Drupal entities
2. **Content Processing**: Process and format content based on indexing mode
3. **Metadata Addition**: Add URL and other metadata
4. **API Communication**: Send to Dify via REST API
5. **Response Handling**: Process API responses and handle errors
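
Step 5 matters because a Search API backend's `indexItems()` method is expected to return only the IDs of items that were indexed successfully; anything else stays queued. The sketch below shows one possible way to sort responses by HTTP status. Both function names and the retry policy (retry on 429 and 5xx) are assumptions for illustration, not the module's documented behaviour.

```php
<?php
declare(strict_types=1);

/**
 * Maps an HTTP status code to an outcome for one item.
 *
 * Assumed policy: 2xx succeeds, 429 and 5xx are worth retrying,
 * everything else is treated as a permanent failure.
 */
function classify_response(int $status): string {
  if ($status >= 200 && $status < 300) {
    return 'indexed';
  }
  if ($status === 429 || $status >= 500) {
    return 'retry';
  }
  return 'failed';
}

/**
 * Groups item IDs by outcome, given a map of item ID => HTTP status.
 */
function summarize_results(array $statusesByItem): array {
  $summary = ['indexed' => [], 'retry' => [], 'failed' => []];
  foreach ($statusesByItem as $itemId => $status) {
    $summary[classify_response($status)][] = $itemId;
  }
  return $summary;
}

$summary = summarize_results([
  'entity:node/1:en' => 200,
  'entity:node/2:en' => 429,
  'entity:node/3:en' => 401,
  'entity:node/4:en' => 503,
]);
```

Only `$summary['indexed']` would be reported back to Search API as indexed; the `failed` group is what to look for in the logs.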

## Performance Considerations

- **Batch Size**: Configure appropriate batch sizes for your server capacity
- **Rate Limiting**: Respect Dify API rate limits
- **Memory Usage**: Monitor memory usage during large indexing operations
- **Network Latency**: Consider network latency to Dify service
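
Batch sizing and rate limiting can be reasoned about with two small helpers. This is a sketch under stated assumptions: `plan_batches()` and `backoff_delay_ms()` are hypothetical names, and the batch size of 50 and the 500 ms base delay are arbitrary example values, not module defaults or Dify limits.

```php
<?php
declare(strict_types=1);

/**
 * Splits item IDs into batches of at most $batchSize items.
 */
function plan_batches(array $itemIds, int $batchSize): array {
  // array_chunk() requires a size of at least 1.
  return array_chunk($itemIds, max(1, $batchSize));
}

/**
 * Exponential backoff delay for a retry attempt (0-based), capped at $maxMs.
 */
function backoff_delay_ms(int $attempt, int $baseMs = 500, int $maxMs = 8000): int {
  // Clamp the exponent so the result always stays an integer.
  $attempt = max(0, min($attempt, 20));
  return min($maxMs, $baseMs * (2 ** $attempt));
}

$itemIds = range(1, 120);
$batches = plan_batches($itemIds, 50);
$delays = array_map('backoff_delay_ms', [0, 1, 2, 3, 10]);
```

Smaller batches keep memory use per request low, while the capped delay keeps retries from overwhelming a rate-limited API.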

## Security Considerations

- **API Key Storage**: Keys are stored in Drupal state, so they are never included in configuration exports
- **Content Access**: Respects Drupal's content access controls
- **HTTPS Communication**: Use HTTPS for all API communications
- **Data Privacy**: Consider data privacy implications when indexing content
