Setting Up the BDDBot

After you have downloaded the latest BDDBot distribution from the BDDBot home page, the first thing you will need to do is change the email address of the administrative contact. Open the file "EnginePrefs.java" (in the bdd/search subdirectory) and find the line that reads


	String email_address = "nobody@nowhere.edu"; // administrator's email address

and change nobody@nowhere.edu to your email address.

Now recompile "EnginePrefs.java" so that the change takes effect. From the directory that contains the "bdd" subdirectory, use the following command on UNIX/Linux


	javac bdd/search/EnginePrefs.java

or the following command on Windows

	javac bdd\search\EnginePrefs.java

Now you need to tell the BDDBot where it should crawl. Look in the "searchdb" directory that was created when you unzipped the distribution and open the file called "urls.txt". This file lists the starting URLs for the crawler, one URL per line. A line beginning with a pound sign (#) is treated as a comment and ignored. Blank lines are also ignored. The BDDBot will recursively crawl the URLs listed here.
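
As a rough illustration of the "urls.txt" format described above, the following Java sketch filters a list of lines the same way: comments and blank lines are dropped and every remaining line is treated as a starting URL. The class name UrlListSketch and the method parseStartUrls are hypothetical and are not part of the BDDBot source. Trimming whitespace around each line is an assumption, and the sample URLs are for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the BDDBot distribution.
class UrlListSketch {

    // Returns the starting URLs found in urls.txt-style lines.
    static List<String> parseStartUrls(List<String> lines) {
        List<String> urls = new ArrayList<>();
        for (String raw : lines) {
            // Assumption: whitespace around a line is not significant.
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // blank lines and comment lines are ignored
            }
            urls.add(line);
        }
        return urls;
    }

    public static void main(String[] args) {
        // Sample content only; substitute the URLs you want crawled.
        List<String> sample = Arrays.asList(
            "# starting points for the crawler",
            "",
            "http://gsd.mit.edu/",
            "file:///home/docs/index.html");
        for (String url : parseStartUrls(sample)) {
            System.out.println(url);
        }
    }
}
```

Running the sketch prints only the two URL lines, since the comment and the blank line are skipped.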

The other file that you will need to edit is "rules.txt". This file specifies which URLs the robot should and should not crawl. A line of the form


	include http://gsd.mit.edu/

will cause all URLs that start with "http://gsd.mit.edu/" to be included. Similarly, to exclude URLs, use the keyword "exclude" instead of "include". Blank lines and lines starting with "#" are ignored.

When a URL is checked against the inclusion/exclusion rules, the exclusion rules are checked first. If the URL matches an exclusion rule, it is not crawled, even if it also matches an inclusion rule. If a URL matches neither kind of rule, it is not crawled, unless it is a "file://" URL, in which case it is included by default.
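
The order of evaluation described above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the BDDBot's actual implementation: the class RuleSketch and its methods parse and shouldCrawl are hypothetical names, prefix matching is assumed to be case-sensitive, and lines that are not valid rules are assumed to be skipped. The "private/" exclusion rule in the sample is invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the rule order; not the BDDBot's own code.
class RuleSketch {
    private final List<String> includes = new ArrayList<>();
    private final List<String> excludes = new ArrayList<>();

    // Builds a rule set from rules.txt-style lines.
    static RuleSketch parse(List<String> lines) {
        RuleSketch rules = new RuleSketch();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // blank lines and comments are ignored
            }
            String[] parts = line.split("\\s+", 2);
            if (parts.length < 2) {
                continue; // assumption: a keyword with no URL prefix is skipped
            }
            if (parts[0].equals("include")) {
                rules.includes.add(parts[1]);
            } else if (parts[0].equals("exclude")) {
                rules.excludes.add(parts[1]);
            }
        }
        return rules;
    }

    // Exclusion rules are checked first, then inclusion rules,
    // then the default for file:// URLs.
    boolean shouldCrawl(String url) {
        for (String prefix : excludes) {
            if (url.startsWith(prefix)) {
                return false;
            }
        }
        for (String prefix : includes) {
            if (url.startsWith(prefix)) {
                return true;
            }
        }
        return url.startsWith("file://");
    }

    public static void main(String[] args) {
        RuleSketch rules = parse(Arrays.asList(
            "# sample rules",
            "include http://gsd.mit.edu/",
            "exclude http://gsd.mit.edu/private/"));
        System.out.println(rules.shouldCrawl("http://gsd.mit.edu/index.html"));
        System.out.println(rules.shouldCrawl("http://gsd.mit.edu/private/notes.html"));
        System.out.println(rules.shouldCrawl("http://example.com/"));
        System.out.println(rules.shouldCrawl("file:///home/docs/index.html"));
    }
}
```

With these sample rules, the first URL is accepted by the inclusion rule, the second is rejected because the exclusion rule is checked first, the third is rejected because no rule covers it, and the fourth is accepted by the "file://" default.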