TO INSTALL AND TEST CLAIRLIB-CORE:

- Clairlib-core requires Perl 5.8.2 or greater. Confirm that you are running
  at least this version with "perl -v".

- Install MEAD 3.11 or higher. MEAD, as well as instructions for installing
  it, can be obtained from http://www.summarization.com. This will also
  install lexrank and tf2gen; the latter needs to be pointed to by $GENPROB
  (see below).

- Install any of the CPAN Perl modules listed below that you do not have:

    - BerkeleyDB
    - Carp
    - File::Type
    - Getopt::Long
    - Graph::Directed
    - HTML::LinkExtractor
    - HTML::Parse
    - IO::File
    - IO::Handle
    - IO::Pipe
    - Lingua::Stem
    - Math::MatrixReal
    - Math::Random
    - MLDBM
    - POSIX
    - Scalar::Util
    - Statistics::ChisqIndep
    - Storable
    - Test::More
    - Text::Sentence
    - XML::Parser
    - XML::Simple

- Unzip/untar the distribution you downloaded (e.g. clairlib-core.tar).

- cd to the top directory of the download.

- Edit lib/Clair/Config.pm. You will see the message:

    #################################
    # For Clairlib-core users:
    # 1. Edit the value assigned to $CLAIRLIB_HOME and give it the value of
    #    the path to your installation.
    # 2. Edit the value assigned to $MEAD_HOME and give it the value that
    #    points to your installation of MEAD.
    # 3. Edit the value assigned to $EMAIL and give it an appropriate value.

  Follow these instructions. It is probably best to keep the values assigned
  to the following variables:

    $CIDR_HOME = "$MEAD_HOME/bin/addons/cidr";
    $PRMAIN = "$MEAD_HOME/bin/feature-scripts/lexrank/prmain";
    $DBM_HOME = "$MEAD_HOME/etc";

  These variables reflect the standard structure of a MEAD installation.

- Edit Statistics/ChisqIndep.pm (a module required by Clairlib) by adding the
  following function:

    sub recompute_chisq {
        my $self = shift;
        $self->{p_value} = chisqrprob($self->{df}, $self->{chisq_statistic});
        return 1;
    }

- cd again to the top directory of the download.

- To test the clairlib-core modules, run the following commands:

    perl Makefile.PL
    make
    make test

- If you would like to install the clairlib-core modules, and you have
  permission to do so, run the command:

    make install

  If you have only local permissions, but you have a personal Perl library at
  $MY_PERL_LIB, you can install clairlib-core there by running the commands:

    perl Makefile.PL PREFIX=$MY_PERL_LIB
    make install

- To use the clairlib-core modules in a script, include the path
  CLAIRLIB_HOME/lib in the @INC variable (either through the PERLLIB or
  PERL5LIB environment variable, or by including a "use lib ..." line at the
  top of the script).

TO INSTALL AND TEST CLAIRLIB-EXT:

- Optional extensions to clairlib-core, as well as functionality that depends
  on other software, have been put into the clairlib-ext distribution. To
  fully utilize clairlib-ext, follow these steps.

- Install Adwait Ratnaparkhi's MxTerminator from
  http://home.comcast.net/~adwaitr/jmx.tar.gz.

- If you have a Charniak parser available, clairlib-ext can make use of it.

- Install the following modules from CPAN:

    - IPC::Open2
    - Net::Google

- Install the Weka toolkit, a Java machine learning library, from
  http://www.cs.waikato.ac.nz/ml/weka/

- Edit lib/Clair/Config.pm, where you have installed clairlib-core. You will
  see this message and (commented-out) assignments. Any of these lines may
  safely be left commented out if you do not have the corresponding resources
  available.

    #################################
    # Only users who have installed Clairlib-ext may need to change the
    # following commented-out assignments.
    #
    # For Clairlib-ext users:

  If you wish to use MxTerminator, uncomment the following lines. Point
  $JMX_HOME to your installation of MxTerminator, and point $JMX_MODEL_PATH
  to the location of your MxTerminator trained data.

    #$JMX_HOME = "/data2/tools/jmx_new";
    #$SENTENCE_SEGMENTER_TYPE = "MxTerminator";
    #$JMX_MODEL_PATH = "/data2/tools/jmx_new/eos.project";

  If you have MySQL installed and wish to use ALE, point $ALE_PORT at your
  MySQL socket. Also uncomment the subsequent two definitions and provide the
  root password to your MySQL installation.

    # $ALE_PORT = "/tmp/mysql.sock";
    # $ALE_DB_USER = "root";
    # $ALE_DB_PASS = "";

  For the WebSearch module, put a Google Search API key here. Unfortunately
  Google does not give out keys anymore and has moved to an AJAX Search API.
  If you have a SOAP API key, you can still use it and WebSearch will still
  work.

    # Please see the clairlib webpage for help obtaining a Google key
    #$GOOGLE_DEFAULT_KEY = "";

  If you have a Charniak parser available, point these variables to it:

    # Default parser and data paths for the Charniak parser for use in Parse.pm
    # (Note that CHARNIAK_DATA should end with a slash and that the other
    # paths include the executable)
    #$CHARNIAK_PATH = "/data0/tools/charniak/PARSE/parseIt";
    #$CHARNIAK_DATA_PATH = "/data0/tools/charniak/DATA/EN/";

    # Default path to Chunklink
    #$CHUNKLINK_PATH = "/data2/tools/chunklink/chunklink.pl";

  If you have installed the Weka machine learning toolkit, uncomment this
  line and point this variable to the location of the Weka jar file:

    #$WEKA_JAR_PATH = "/data0/users/mjschal/weka/weka-3-4-10/weka.jar";

- Unzip/untar your downloaded version of Clairlib-ext. Then cd to the top
  directory of the download.

- To test the clairlib-ext modules, you must first have installed the
  clairlib-core modules.
  Confirm that you have done so, then run the following commands:

    perl Makefile.PL
    make
    make test

- If you would like to install the clairlib-ext modules, and you have
  permission to do so, run the command:

    make install

  If you have only local permissions, but you have a personal Perl library at
  $MY_PERL_LIB, you can install clairlib-ext there by running the commands:

    perl Makefile.PL PREFIX=$MY_PERL_LIB
    make install

- To use the clairlib-ext modules in a script, include the path
  (your clairlib-ext directory here)/lib in the @INC variable (either through
  the PERLLIB or PERL5LIB environment variable, or by including a
  "use lib ..." line at the top of the script).

KNOWN ISSUES

At this time there are some documentation issues that we intend to address in
a future release of Clairlib. The section "Structure of Code" in the PDF
tutorial does not offer an exhaustive list of the modules in the Clair::*
namespace, though it does cover the most important core modules. A complete
overview is in progress and will be released soon.

Relatedly, many of the Clairlib modules use different formats for their
documentation. We intend to unify the format of all Clairlib module
documentation in a future release.

SUPPORT AND DOCUMENTATION

After installing, you can find documentation for a module with the perldoc
command, e.g.

    perldoc Clair::Document

Each distribution also includes a PDF tutorial. Online API documentation can
be found at the Clairlib homepage at
http://tangra.si.umich.edu/clair/clairlib.

COPYRIGHT AND LICENCE

Copyright (C) 2000-2007 The Clair group

This program is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.

This work has been supported in part by National Institutes of Health grants
R01 LM008106 "Representing and Acquiring Knowledge of Genome Regulation" and
U54 DA021519 "National center for integrative bioinformatics", as well as by
grants IDM 0329043 "Probabilistic and link-based Methods for Exploiting Very
Large Textual Repositories", DHB 0527513 "The Dynamics of Political
Representation and Political Rhetoric", and 0534323 "Collaborative Research:
BlogoCenter - Infrastructure for Collecting, Mining and Accessing Blogs" from
the National Science Foundation.
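
EXAMPLE: CHECKING FOR MISSING PERL MODULES

The clairlib-core installation steps ask you to install any of the listed
CPAN modules that you do not already have. As a convenience, the following
shell sketch reports which of those modules fail to load. It assumes a POSIX
shell with perl on your PATH; the function name missing_modules and the
variable CORE_MODULES are hypothetical and are not part of Clairlib or MEAD.

```shell
#!/bin/sh
# Sketch only: print the name of every given Perl module that cannot be loaded.
# missing_modules is a hypothetical helper, not something Clairlib provides.
missing_modules() {
    for mod in "$@"; do
        # "perl -MModule -e 1" exits non-zero when the module fails to load.
        if ! perl -M"$mod" -e 1 >/dev/null 2>&1; then
            echo "$mod"
        fi
    done
}

# Modules named in the clairlib-core installation steps.
CORE_MODULES="BerkeleyDB Carp File::Type Getopt::Long Graph::Directed
HTML::LinkExtractor HTML::Parse IO::File IO::Handle IO::Pipe Lingua::Stem
Math::MatrixReal Math::Random MLDBM POSIX Scalar::Util
Statistics::ChisqIndep Storable Test::More Text::Sentence XML::Parser
XML::Simple"

# Word splitting of $CORE_MODULES is intentional: one argument per module.
missing_modules $CORE_MODULES
```

Each name printed is a module that did not load and probably still needs to
be installed from CPAN; no output means every listed module loaded. A module
can also fail to load because one of its own dependencies is broken, so treat
the output as a starting point rather than a definitive list.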