WARNING: When set up correctly with a firewall or similar system, this module will slow down and block traffic to your website. This could potentially include traffic from you, website owners, maintainers and legitimate visitors to your website. If you use this module, you do so at your own risk.
The Honeytrap module allows site owners and system administrators to monitor web crawlers that do not follow the rules set out in the robots.txt file, or via the RobotsTxt module or a similar method, and that as a result put an unnecessarily high load on servers.
This is especially important for large, very high traffic and/or high profile sites, where the activity of these non-compliant crawlers can bring servers to their knees. If they are not blocked or slowed down quickly enough, they can knock servers completely offline.
The Honeytrap module does not directly block or slow down offending IP addresses itself; it only logs and reports them, leaving you in full control of how you want to deal with them.
Install as usual, see drupal.org/node/70151 for further information.
To make things as clear as possible I have included a list of common terms that are used in this guide.
Configure the following Honeytrap user permissions in Administration » People » Permissions » Honeytrap module:
edit honeytraps - Users in roles with the "edit honeytraps" permission will be able to see and alter the Honeytrap settings via Administration » Site configuration » Honeytrap.
view honeytrap lists - Users with this permission can view the Honeytrap lists via dedicated URLs of the format Honeytrap/list/list_name (e.g. the black list can be seen at Honeytrap/list/black). Users in roles with the "Access administration menu" permission may also view the Honeytrap lists on the admin pages under the "View lists" tab.
If configured, ensure that your web service has read and write access to the location of the list files:
List files - The lists can be read either from URLs of the format honeytrap/lists/list_name or from the list files themselves. These files allow other processes, such as firewalls, to be updated instantly. If you enable the list file functionality then you must ensure that your web service has write access to the folders that hold them.
If you don't have the RobotsTxt module enabled then ensure that you add the correct entries to your robots.txt file as indicated on the settings page of the Honeytrap module.
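Once the entries are in place, it can be worth confirming that a well-behaved crawler would really be told to stay away from the traps. The following is a minimal Python sketch, not part of the module. The trap path /honeytrap/trap/ and the function name trap_is_disallowed are hypothetical, for illustration only; use the entries and example trap URL shown on your own settings page.

```python
from urllib.robotparser import RobotFileParser


def trap_is_disallowed(robots_txt, trap_path, user_agent="*"):
    """Return True if robots_txt tells user_agent not to fetch trap_path.

    robots_txt is the text of your robots.txt file. trap_path is a
    hypothetical trap path; the real one is on the Honeytrap settings page.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, trap_path)


# Example robots.txt content (the Disallow path is an assumption).
EXAMPLE_ROBOTS_TXT = "User-agent: *\nDisallow: /honeytrap/trap/\n"

if __name__ == "__main__":
    print(trap_is_disallowed(EXAMPLE_ROBOTS_TXT, "/honeytrap/trap/example"))
```

If the function returns False for a trap path, compliant crawlers are not being warned off and may end up on the naughty list.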
To alter the output format of the Honeytrap lists you will need to define your own honeytrap_list_item theme.
Lots of other settings can be altered on the administration settings tab for the Honeytrap module. This tab can be found at admin/settings/honeytrap/settings.
The Honeytrap module makes use of three lists. For automatic and optimal site performance these lists should be used in conjunction with a firewall or similar system as follows:
naughty list - Firewalls should throttle IP addresses on this list to slow down their access to your site. Items that trigger a trap are added to the naughty list unless they are on the black or white list.
black list - Firewalls should block IP addresses on this list. You can manually add or edit IP addresses to put them on the black list.
white list - Firewalls should ignore this list. You can manually add or edit IP addresses to put them on the white list. The white list is used to prevent the specified IP addresses from appearing on any other list. You should normally ensure that you add your own IP address to this list.
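The way a firewall script might combine the three lists can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name firewall_actions and the "block" and "throttle" labels are hypothetical, and the module itself does not ship such a script. The white list is applied first as a safety net, so a white-listed address is never blocked or throttled even if it somehow appears on another list.

```python
def firewall_actions(naughty, black, white):
    """Map each IP address to the action a firewall should take.

    Each argument is an iterable of IP address strings taken from the
    corresponding Honeytrap list. White-listed addresses get no action,
    black-listed addresses are blocked, and the remaining naughty
    addresses are throttled.
    """
    white = set(white)
    black = set(black) - white
    naughty = set(naughty) - white - black

    actions = {ip: "block" for ip in black}
    actions.update({ip: "throttle" for ip in naughty})
    return actions


if __name__ == "__main__":
    # Documentation-only example addresses.
    result = firewall_actions(
        naughty=["192.0.2.1", "192.0.2.2", "192.0.2.3"],
        black=["192.0.2.2", "198.51.100.7"],
        white=["192.0.2.3"],
    )
    for ip in sorted(result):
        print(ip, result[ip])
```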
Below is a list of common problems that you may encounter. Each problem is followed by a list of likely causes which should help you to resolve the problem.
A: The easiest way to test your traps is simply to visit a trap in the browser. You will find an example URL for a trap on the Settings tab. Once you are happy with this you can carry out more "real life" tests using something like "wget" or by downloading a web crawler that will ignore the robots.txt file.
WARNING: Be careful that you don't block yourself, especially if you have a firewall fully connected up to the Honeytrap. This is even more important if you only have access to your site via a single IP address. I suggest that you add your IP address to the white list before carrying out any of these tests. You will be able to monitor hits to the trap URLs via the watchdog log even if your IP address is on the white list.
A: Have you set up your firewall to read in the list files created by the Honeytrap? If so, check that the files exist where you expect them and that they contain the anticipated IP addresses. If they do, then check your firewall setup is correct and that it is actually reading the files. Also check that the list file format is what your firewall expects.
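A quick way to check the contents of a list file is to run it through a small script. The Python sketch below assumes the file holds one IP address per line, which may not match your configured output format (the format can be changed through theming), so adjust it to suit. The function name check_list_text is hypothetical.

```python
import ipaddress


def check_list_text(text):
    """Split list file text into valid IP addresses and bad entries.

    Assumes one IP address per line; blank lines are skipped.
    Returns (valid, invalid), where invalid holds (line_number, entry).
    """
    valid, invalid = [], []
    for number, line in enumerate(text.splitlines(), start=1):
        entry = line.strip()
        if not entry:
            continue
        try:
            valid.append(str(ipaddress.ip_address(entry)))
        except ValueError:
            # Not a parseable IPv4 or IPv6 address.
            invalid.append((number, entry))
    return valid, invalid


if __name__ == "__main__":
    sample = "192.0.2.1\n\nnot-an-ip\n2001:db8::1\n"
    good, bad = check_list_text(sample)
    print("valid:", good)
    print("invalid:", bad)
```

Any entries reported as invalid point to a mismatch between the list file format and what a firewall reading plain IP addresses would expect.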
Mike Jessop a.k.a. Mikey Bunny (don't ask)
Moo Free Chocolates
www.moofreechocolates.com
Manufacturer of scrummy tasting dairy free chocolates.