# RAIL Ethical Dimensions Guardrail

## Overview

The RAIL Ethical Dimensions Guardrail evaluates content against RAIL's 8 Dimensions of Ethical AI framework. It plugs into the Drupal AI module and scores content on each dimension, with configurable weights and thresholds.

## Key Features

- **8 Ethical Dimensions**: Evaluate content across Fairness, Safety, Reliability, Transparency, Privacy, Accountability, Inclusivity, and User Impact
- **Weighted Scoring**: Configure individual weights for each dimension to prioritize what matters most for your use case
- **Critical Dimensions**: Violations in the Safety or Privacy dimensions always stop processing, regardless of the configured action
- **Flexible Actions**: Choose to stop processing, request rewrites, or log warnings when violations are detected
- **Configuration Presets**: Quick setup with Strict, Balanced, or Permissive modes
- **Real-time Validation**: Immediate feedback on configuration changes with helpful recommendations

## Configuration Guide

### API Settings

1. **Endpoint URL**: Enter your RAIL API endpoint URL (must use HTTPS)
   - Example: `https://api.rail.com/v1/evaluate`
   - Contact RAIL support for your specific endpoint

2. **API Key**: Your RAIL API authentication key
   - Obtain from your RAIL account dashboard
   - Stored in Drupal's configuration system, so restrict access to exported configuration

3. **Timeout**: Maximum time to wait for API responses (10-300 seconds)
   - Recommended: 60 seconds for most use cases
   - Use higher values for complex evaluations
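
The constraints above (HTTPS endpoint, non-trivial API key, timeout between 10 and 300 seconds) can be sketched as a small validation routine. This is an illustrative Python sketch, not the module's actual code: the function name `validate_api_settings` and the HTTPS and timeout messages are assumptions, while the other two messages and the 10-character minimum are taken from the Troubleshooting section.

```python
from urllib.parse import urlparse

MIN_KEY_LENGTH = 10        # from the troubleshooting note on short API keys
TIMEOUT_RANGE = (10, 300)  # seconds, as documented for the Timeout setting


def validate_api_settings(endpoint: str, api_key: str, timeout: int) -> list:
    """Return a list of problems; an empty list means the settings look valid.

    Hypothetical helper for illustration only.
    """
    errors = []
    endpoint = endpoint.strip()
    parsed = urlparse(endpoint)
    if not endpoint:
        errors.append("API endpoint URL is required")
    elif parsed.scheme != "https" or not parsed.netloc:
        # Assumed wording; the document only states that HTTPS is required.
        errors.append("API endpoint URL must be a valid HTTPS URL")
    if len(api_key.strip()) < MIN_KEY_LENGTH:
        errors.append("API key appears to be too short")
    low, high = TIMEOUT_RANGE
    if not low <= timeout <= high:
        errors.append(f"Timeout must be between {low} and {high} seconds")
    return errors
```

Returning every problem at once, rather than stopping at the first, lets a form show all issues in a single pass.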

### Evaluation Settings

1. **Evaluation Mode**: When to perform ethical evaluation
   - **Input only**: Evaluate user input before AI processing
   - **Output only**: Evaluate AI-generated responses
   - **Both**: Evaluate both input and output (recommended)

2. **Action on Violation**: What to do when ethical violations are detected
   - **Stop processing**: Block content that violates ethical standards
   - **Request rewrite**: Ask for content modification to address violations
   - **Log warning and continue**: Allow content but log violations for review

3. **Overall Weighted Threshold**: Minimum weighted score required to pass (0-10 scale)
   - Higher values are stricter
   - Recommended: 7.0 for balanced protection
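
To make the overall weighted threshold concrete, the sketch below computes a weighted average of per-dimension scores and compares it with the threshold. The exact formula RAIL uses is not stated in this document, so the weighted average here is an assumption, and `weighted_score` is a hypothetical name.

```python
def weighted_score(scores: dict, weights: dict) -> float:
    """Weighted average of dimension scores on the 0-10 scale.

    Assumes the overall score is sum(score * weight) / sum(weight);
    the real RAIL formula may differ.
    """
    total_weight = sum(weights[dim] for dim in scores)
    if total_weight <= 0:
        raise ValueError("at least one dimension needs a positive weight")
    return sum(scores[dim] * weights[dim] for dim in scores) / total_weight


# Example: Safety and Privacy weighted twice as heavily as Fairness.
scores = {"safety": 9.0, "privacy": 8.0, "fairness": 6.0}
weights = {"safety": 2.0, "privacy": 2.0, "fairness": 1.0}

overall = weighted_score(scores, weights)  # (18 + 16 + 6) / 5 = 8.0
passes = overall >= 7.0                    # compare against the overall threshold
```

Under this assumption, the low Fairness score is outweighed by the heavily weighted Safety and Privacy scores, so the content passes the overall threshold of 7.0 even though Fairness alone would not.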

### Dimension Configuration

Each ethical dimension can be individually configured:

#### Fairness
- **Purpose**: Ensures content treats all groups equitably without bias or discrimination
- **Examples**: Gender-neutral job descriptions, culturally sensitive content, avoiding stereotypes
- **Recommended Weight**: 1.0-1.5
- **Recommended Threshold**: 7.0-8.0

#### Safety (Critical)
- **Purpose**: Prevents harmful, dangerous, or unsafe content
- **Examples**: Violence prevention, suicide prevention, avoiding dangerous instructions
- **Recommended Weight**: 1.5-2.0
- **Recommended Threshold**: 8.0-9.0
- **Note**: Always stops processing on violation

#### Reliability
- **Purpose**: Ensures content is accurate, consistent, and dependable
- **Examples**: Fact-checked information, consistent messaging, credible sources
- **Recommended Weight**: 1.0-1.5
- **Recommended Threshold**: 7.0-8.0

#### Transparency
- **Purpose**: Promotes clear, understandable, and explainable content
- **Examples**: Clear explanations, citing sources, acknowledging uncertainty
- **Recommended Weight**: 1.0
- **Recommended Threshold**: 6.5-7.5

#### Privacy (Critical)
- **Purpose**: Protects personal information and user privacy
- **Examples**: Avoiding personal data collection, respecting user consent, data anonymization
- **Recommended Weight**: 1.5-2.0
- **Recommended Threshold**: 8.0-9.0
- **Note**: Always stops processing on violation

#### Accountability
- **Purpose**: Ensures content supports responsible and traceable AI decisions
- **Examples**: Clear attribution, taking responsibility for claims, audit trails
- **Recommended Weight**: 1.0
- **Recommended Threshold**: 6.5-7.5

#### Inclusivity
- **Purpose**: Promotes content that is accessible and inclusive to all users
- **Examples**: Accessible language, diverse representation, inclusive design
- **Recommended Weight**: 1.0-1.5
- **Recommended Threshold**: 7.0-8.0

#### User Impact
- **Purpose**: Considers the broader impact of content on users and society
- **Examples**: Promoting wellbeing, considering social impact, avoiding manipulation
- **Recommended Weight**: 1.0
- **Recommended Threshold**: 6.5-7.5
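
The per-dimension thresholds and the special handling of critical dimensions can be sketched as follows. This is a simplified illustration that ignores the overall weighted threshold; `resolve_action` and the `"pass"` return value are hypothetical, while `stop` and `warn` mirror the `action_on_violation` values used in the Integration Examples.

```python
CRITICAL_DIMENSIONS = {"safety", "privacy"}


def resolve_action(scores: dict, thresholds: dict, configured_action: str):
    """Return (action, violations) for one evaluation.

    A dimension is violated when its score falls below its threshold.
    A violation in a critical dimension forces "stop" regardless of the
    configured action.
    """
    violations = sorted(dim for dim, score in scores.items()
                        if score < thresholds[dim])
    if not violations:
        return "pass", violations
    if CRITICAL_DIMENSIONS.intersection(violations):
        return "stop", violations
    return configured_action, violations


thresholds = {"safety": 8.0, "fairness": 7.0}

# Safety is critical, so the configured "warn" is overridden.
resolve_action({"safety": 7.5, "fairness": 9.0}, thresholds, "warn")

# Only Fairness is violated, so the configured action applies.
resolve_action({"safety": 9.0, "fairness": 6.0}, thresholds, "warn")
```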

## Configuration Presets

### Strict Mode
- **Use Case**: High-security environments, sensitive content, public-facing applications
- **Overall Threshold**: 8.5
- **Characteristics**: All dimensions enabled with high thresholds and increased weights for critical dimensions
- **Action**: Stop processing on violations

### Balanced Mode (Recommended)
- **Use Case**: General-purpose applications, balanced protection and usability
- **Overall Threshold**: 7.0
- **Characteristics**: All dimensions enabled with moderate thresholds and standard weights
- **Action**: Stop processing on violations

### Permissive Mode
- **Use Case**: Internal tools, development environments, creative applications
- **Overall Threshold**: 5.5
- **Characteristics**: Only essential dimensions enabled with lower thresholds
- **Action**: Log warning and continue
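
One way to think about presets is as a base configuration that site-specific settings are layered on top of. The sketch below encodes only the values documented above (overall threshold and action). Per-dimension weights and thresholds are omitted because this document does not list them, and `apply_preset` is a hypothetical helper.

```python
PRESETS = {
    "strict": {"overall_threshold": 8.5, "action_on_violation": "stop"},
    "balanced": {"overall_threshold": 7.0, "action_on_violation": "stop"},
    "permissive": {"overall_threshold": 5.5, "action_on_violation": "warn"},
}


def apply_preset(name: str, overrides=None) -> dict:
    """Start from a preset and apply optional overrides on top of it."""
    if name not in PRESETS:
        raise ValueError(f"unknown preset: {name}")
    config = dict(PRESETS[name])  # copy so the preset itself is not mutated
    config.update(overrides or {})
    return config


# Permissive defaults, but with a slightly higher overall threshold.
config = apply_preset("permissive", {"overall_threshold": 6.0})
```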

## Best Practices

### Getting Started
1. Start with the **Balanced Mode** preset for most use cases
2. Enable at least **Safety** and **Privacy** dimensions for basic protection
3. Test with sample content to understand scoring behavior
4. Adjust thresholds based on your specific requirements

### Weight Configuration
- Use higher weights (1.5-2.0) for dimensions most important to your use case
- Keep critical dimensions (Safety, Privacy) at higher weights
- Keep weights in proportion: one very large weight lets a single dimension dominate the overall weighted score

### Threshold Setting
- Start with recommended thresholds and adjust based on testing
- Higher thresholds (8.0+) provide stricter evaluation
- Lower thresholds (6.0-7.0) are more permissive
- Monitor violation rates and adjust accordingly

### Performance Optimization
- Enable only dimensions relevant to your content type
- Use appropriate timeout values (60 seconds recommended)
- Monitor API usage and response times
- Consider caching for repeated evaluations
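
As a sketch of the caching suggestion, the class below keeps evaluation results in memory, keyed by a hash of the content and the configuration used. It is an assumption about how caching could be done, not a feature of the module: `EvaluationCache` and the `evaluate` callable are hypothetical, and a production cache would also need expiry and size limits.

```python
import hashlib
import json


class EvaluationCache:
    """In-memory cache keyed by a SHA-256 hash of content plus config."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(content: str, config: dict) -> str:
        # sort_keys makes the key independent of dict ordering.
        payload = json.dumps({"content": content, "config": config},
                             sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_evaluate(self, content: str, config: dict, evaluate):
        """Return a cached result, calling evaluate(content, config) on a miss."""
        key = self._key(content, config)
        if key not in self._store:
            self._store[key] = evaluate(content, config)
        return self._store[key]
```

Including the configuration in the key matters: the same text evaluated with different enabled dimensions or weights should not reuse an earlier result.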

## Troubleshooting

### Common Issues

#### "API endpoint URL is required"
- **Cause**: Missing or invalid API endpoint URL
- **Solution**: Enter a valid HTTPS URL provided by RAIL support
- **Example**: `https://api.rail.com/v1/evaluate`

#### "API key appears to be too short"
- **Cause**: Invalid or incomplete API key
- **Solution**: Copy the complete API key from your RAIL account dashboard
- **Note**: API keys should be at least 10 characters long

#### "No dimensions enabled"
- **Cause**: All ethical dimensions are disabled
- **Solution**: Enable at least one dimension, preferably Safety and Privacy
- **Recommendation**: Use a configuration preset for quick setup

#### "Connection test failed"
- **Cause**: Network issues, invalid credentials, or API unavailability
- **Solutions**:
  1. Check internet connection
  2. Verify API endpoint URL and key
  3. Check firewall settings for HTTPS access
  4. Contact RAIL support if issues persist

#### "Evaluation taking too long"
- **Cause**: API timeout or slow response
- **Solutions**:
  1. Increase timeout value (up to 300 seconds)
  2. Reduce number of enabled dimensions
  3. Check API status with RAIL support

### Error Messages

#### "RAIL API authentication issue detected"
- **Meaning**: API key is invalid or expired
- **Action**: Update API key in configuration

#### "RAIL API temporarily unavailable"
- **Meaning**: API service is down or unreachable
- **Action**: The system fails open (content is allowed to proceed)
- **Note**: Check RAIL service status

#### "Dimension score validation failed"
- **Meaning**: API returned invalid or incomplete scores
- **Action**: The system fails open and logs the issue
- **Note**: Contact RAIL support if persistent
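
The fail-open behavior described for the last two errors can be sketched as a wrapper around the API call. This is a minimal illustration under stated assumptions: `call_api` stands in for whatever performs the request, `None` is used here to mean "skip evaluation and allow the content", and the score check assumes the 0-10 scale used for thresholds. None of these names come from the module.

```python
import logging

logger = logging.getLogger("rail_guardrail")  # hypothetical logger name


def validate_scores(scores, enabled_dimensions) -> bool:
    """True when every enabled dimension has a numeric score from 0 to 10."""
    if not isinstance(scores, dict):
        return False
    for dim in enabled_dimensions:
        value = scores.get(dim)
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
        if not 0 <= value <= 10:
            return False
    return True


def evaluate_fail_open(call_api, content, enabled_dimensions):
    """Return validated scores, or None when the guardrail should fail open."""
    try:
        scores = call_api(content)
    except Exception as exc:  # network error, timeout, server error
        logger.warning("RAIL API temporarily unavailable: %s", exc)
        return None
    if not validate_scores(scores, enabled_dimensions):
        logger.warning("Dimension score validation failed")
        return None
    return scores
```

Failing open favors availability over strictness: content is not blocked when the API is down, so the logged warnings are the only record that an evaluation was skipped.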

## Integration Examples

### Basic Setup
```yaml
# Minimal configuration for basic protection
enabled_dimensions:
  - safety
  - privacy
dimension_weights:
  safety: 2.0
  privacy: 2.0
overall_threshold: 7.0
action_on_violation: stop
```

### Content Moderation
```yaml
# Configuration for content moderation platform
enabled_dimensions:
  - fairness
  - safety
  - privacy
  - inclusivity
dimension_weights:
  fairness: 1.5
  safety: 2.0
  privacy: 2.0
  inclusivity: 1.5
overall_threshold: 7.5
action_on_violation: stop
```

### Creative Writing Assistant
```yaml
# Configuration for creative writing tools
enabled_dimensions:
  - safety
  - privacy
  - user_impact
dimension_weights:
  safety: 1.5
  privacy: 1.5
  user_impact: 1.0
overall_threshold: 6.0
action_on_violation: warn
```

## Support and Resources

### Getting Help
- **Module Issues**: Create an issue in the Drupal project queue
- **RAIL API Issues**: Contact RAIL support directly
- **Configuration Questions**: Consult this documentation or community forums

### Additional Resources
- [RAIL Official Documentation](https://rail.com/docs)
- [Drupal AI Module Documentation](https://drupal.org/project/ai)
- [Ethical AI Best Practices](https://rail.com/ethics)

### Version Information
- **Module Version**: Check your installed version in the module list
- **API Version**: Displayed in the configuration form after a successful connection test
- **Compatibility**: Requires Drupal AI module 2.x or higher

## Changelog

### Recent Improvements
- Enhanced configuration UI with tooltips and examples
- Real-time validation and recommendations
- Configuration presets for quick setup
- Improved error messages and troubleshooting guidance
- Better performance monitoring and logging

### Upcoming Features
- Advanced caching mechanisms
- Batch evaluation capabilities
- Custom dimension definitions
- Integration with content workflows