2.1 Greenstone
Greenstone (New Zealand Digital Library Project,
2008; Witten et al., 2009) is a software suite for cre-
ating custom digital library collections and making
them available locally or via the Internet. A digital
library collection contains documents of varying for-
mats, including images, PDF files, Word documents,
etc. A Greenstone collection can be customized in
many ways. For example, a user can specify the ap-
pearance of the interface to their collection, how the
collection will be accessed, and the types of docu-
ments that the collection can contain.
A Greenstone collection can be maintained in two
ways. The first is by using the import and build com-
mands that are executed from a command prompt.
The second is by using the Greenstone Librarian In-
terface (GLI) (Witten, 2004), which allows users who
are more comfortable with using a graphical user in-
terface to have full access to the same functionality.
2.2 Cron
Cron (Nemeth et al., 2007; Vixie, 1994; Schapira,
2004) is a program that lets users schedule tasks to
run automatically at a specified time. A task can
be one command, or a script containing several com-
mands that are executed in sequence. Initially, Cron
was implemented for Unix and Linux platforms, with
most systems running Vixie Cron (Vixie, 1994). Mac
OS X also runs Vixie Cron. In addition, versions of
Cron now exist for Windows platforms, such as Py-
cron (Schapira, 2004). We summarize the general
ideas behind all implementations of Cron that are ap-
plicable to our work.
Every minute, Cron reads task configuration files,
called crontab files, that contain a record for every
task that is scheduled for execution at a specific time.
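The format of a crontab record is (min hr dom moy
dow user task), where min, hr, dom, moy, and dow
are the minute, hour, day of month, month of year,
and day of week, respectively, user is the username
that the command will run under, and task is the
command or script that is executed at the specified time.
The user field appears only in system crontab files;
records in a user crontab file omit it, because every
task runs under the username of the file's owner.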
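The record layout described above can be illustrated with a minimal Python sketch that splits one crontab line into its named fields. The names CrontabRecord, parse_record, and has_user are hypothetical and introduced only for illustration; they are not part of Cron or Greenstone, and someuser in the usage lines is a placeholder username.

```python
from typing import NamedTuple, Optional


class CrontabRecord(NamedTuple):
    """One crontab record: five time fields, an optional user, and the task."""
    minute: str
    hour: str
    day_of_month: str
    month: str
    day_of_week: str
    user: Optional[str]
    task: str


def parse_record(line: str, has_user: bool = True) -> CrontabRecord:
    """Split a crontab line into fields (hypothetical helper).

    has_user=True assumes the layout with a username field;
    has_user=False assumes the layout without one.
    """
    leading = 6 if has_user else 5
    # Split on whitespace at most `leading` times so that the task,
    # which may itself contain spaces, stays in one piece.
    parts = line.strip().split(None, leading)
    if len(parts) < leading + 1:
        raise ValueError(f"incomplete crontab record: {line!r}")
    time_fields = parts[:5]
    user = parts[5] if has_user else None
    return CrontabRecord(*time_fields, user, parts[-1])


if __name__ == "__main__":
    print(parse_record("59 23 * * * someuser /usr/bin/cleanup.bash"))
    print(parse_record('00 0 1 1 * echo "Happy New Year!"', has_user=False))
```

The sketch treats every field as an opaque string; it does not validate ranges or expand lists and step values, which a real Cron implementation must do.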
There are two types of crontab file: system and user.
A system crontab file is mainly for system
administration and maintenance tasks. Root privileges are
required to modify a system crontab file, even if
every task in it runs under a low-privilege username.
A user crontab file can be set up by any user
on the system to execute tasks under their own
username. Assuming the user has permission to execute
the task, no root permissions are required. Therefore,
the Greenstone scheduler uses a user crontab file.
30 * * * * /collect/pics/gsdl.pl
59 23 * * * /usr/bin/cleanup.bash
00 6 * * 7 /home/someuser/alarm
00 0 1 1 * echo "Happy New Year!"
Figure 1: Sample Crontab File.
#!/usr/bin/perl
$ENV{'GSDLHOME'}="/home2/gsdl/gsdl";
$ENV{'GSDLOS'}="linux";
$ENV{'GSDLLANG'}="";
$ENV{'PATH'}="/bin:/usr/local/bin:/usr/bin:/usr/local/gsdl/bin/script:/usr/local/gsdl/bin/linux";
system("import.pl -removeold pics");
system("buildcol.pl -removeold pics");
system("\\rm -r /collect/pics/index/*");
system("mv /collect/pics/building/* gsdl/collect/pics/index/");
system("chmod -R 755 /collect/pics/index/*");
00 0 * * * /gsdl/collect/pics/cron.pl
Figure 2: Sample Building Script and Crontab Record.
Figure 1 displays a sample user crontab file
with four tasks that are scheduled for specific times.
The first task, /collect/pics/gsdl.pl, is scheduled to
run at 30 minutes past every hour. The second
task, /usr/bin/cleanup.bash, is scheduled for
execution daily at 11:59pm. The third task,
/home/someuser/alarm, is scheduled every Sunday at
6:00am. Finally, the fourth task, which echoes
"Happy New Year!", is scheduled for execution every
January 1st at 12:00am.
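To make the reading of the four schedules concrete, the following Python sketch checks whether a five-field schedule is due at a given time. It is a simplified illustration, not how any Cron implementation is written: it assumes each field is either * or a single number, and the names field_matches and is_due are hypothetical. It follows the Vixie Cron conventions that both 0 and 7 denote Sunday and that, when day of month and day of week are both restricted, a match on either one is sufficient.

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """True if a field is the wildcard or equals the given value."""
    return field == "*" or int(field) == value


def is_due(schedule: str, when: datetime) -> bool:
    """Check a 'min hr dom moy dow' schedule against a point in time.

    Simplification: lists, ranges, and step values are not supported.
    """
    minute, hour, dom, month, dow = schedule.split()

    # Cron numbers days of the week 0-6 from Sunday, and also accepts 7
    # for Sunday; isoweekday() returns 1 (Monday) to 7 (Sunday).
    cron_dow = when.isoweekday() % 7
    dow_ok = dow == "*" or int(dow) % 7 == cron_dow
    dom_ok = field_matches(dom, when.day)

    # If both day fields are restricted, either one may match.
    if dom != "*" and dow != "*":
        day_ok = dom_ok or dow_ok
    else:
        day_ok = dom_ok and dow_ok

    return (field_matches(minute, when.minute)
            and field_matches(hour, when.hour)
            and field_matches(month, when.month)
            and day_ok)


if __name__ == "__main__":
    # 5 June 2011 was a Sunday, so the third schedule is due at 6:00am.
    print(is_due("00 6 * * 7", datetime(2011, 6, 5, 6, 0)))
    print(is_due("30 * * * *", datetime(2011, 6, 5, 6, 0)))
```

Because Cron wakes once per minute, a check of this kind only needs minute resolution; seconds are ignored.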
2.3 The Scheduler
The Scheduler (Osborn and Fox, 2007; Osborn et al.,
2008) is a command-line program that is part of the
Greenstone software suite. The Scheduler creates a
building script for a specific collection and also inter-
acts with Cron on the operating system. The original
version of the Scheduler takes as command-line argu-
ments the name of the collection, the import and build
commands (and all of their required arguments), and
the frequency of execution (hourly, daily or weekly).
The Scheduler produces two outputs: a customized Perl
script that rebuilds the collection, and a modification
to the Cron scheduling service so that it executes the
script at the specified frequency.
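The mapping from a frequency keyword to a crontab record can be sketched in Python as follows. This is an illustration under stated assumptions, not the Scheduler's actual code, which is written in Perl. The daily schedule matches the record shown in Figure 2; the hourly and weekly schedules, the SCHEDULES table, and the crontab_record function are hypothetical choices made for this example.

```python
# Assumed schedule for each frequency keyword. Only "daily" is taken
# from Figure 2; "hourly" and "weekly" are illustrative guesses.
SCHEDULES = {
    "hourly": "00 * * * *",   # on the hour, every hour
    "daily": "00 0 * * *",    # every day at 12:00am
    "weekly": "00 0 * * 0",   # every Sunday at 12:00am
}


def crontab_record(frequency: str, script_path: str) -> str:
    """Build a user crontab record that runs script_path at the frequency."""
    try:
        schedule = SCHEDULES[frequency]
    except KeyError:
        raise ValueError(
            f"frequency must be one of {sorted(SCHEDULES)}, got {frequency!r}"
        ) from None
    return f"{schedule} {script_path}"


if __name__ == "__main__":
    print(crontab_record("daily", "/gsdl/collect/pics/cron.pl"))
```

The record has no username field because, as noted in Section 2.2, the Scheduler writes to a user crontab file.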
For example, suppose we want to schedule a daily
build of the collection pics. A call to the Scheduler
would resemble the following:
schedule.pl
-import "import.pl -removeold pics"
ICEIS 2011 - 13th International Conference on Enterprise Information Systems