Search/Context_Graph version 0.01
=================================

This module is an implementation of a search technique called 'contextual
network search', or 'spreading activation search'.

The idea is to represent a document collection as a set of document and term
nodes in a bipartite graph.  How you generate the term list is up to you; our
own approach is to extract all nouns and noun phrases using a part-of-speech
tagger (see Lingua::EN::Tagger).

Documents and terms are connected by weighted edges.  The weights are determined
by whatever weighting algorithm you choose.  The only restriction is that each
weight must fall between 0 and 1.
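
As an illustration of a weighting scheme that respects this restriction, the
sketch below divides each raw term count by the largest count in the same
document.  The helper name scale_weights and the scaling rule are assumptions
made for this example; they are not the module's own weighting code.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper (not part of the module): scale the raw term counts
# of one document so that the largest weight is 1 and none exceeds 1.
sub scale_weights {
    my ($counts) = @_;                        # { WORD => COUNT, ... }
    my $max = max(values %$counts) or return +{};   # empty or all-zero input
    return +{ map { $_ => $counts->{$_} / $max } keys %$counts };
}

my $weights = scale_weights({ graph => 4, search => 2, energy => 1 });
printf "%-6s %.2f\n", $_, $weights->{$_} for sort keys %$weights;
```

Any other scheme works equally well as long as the resulting weights stay
within the allowed range.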

We search the graph by energizing a query node with an arbitrary starting energy
E.  We then distribute that energy among the neighbor nodes as follows.  First,
divide the energy by the number of neighbor nodes; call this value S.  For
example, if the starting energy is 10,000 and the node has five neighbors,
S = 2,000 units.  Next, compare S with an arbitrary threshold.  If S is less
than the threshold, we stop propagating.  If S exceeds the threshold, we assign
energies to all the neighbor nodes and recurse on each of them.

The energy assigned to each neighbor node will depend on the weight of the edge
connecting it to the starting node.  Since this weight is guaranteed to fall
between 0 and 1, the maximum energy a neighbor node can receive is S.
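
The propagation rule can be sketched in a few lines of Perl.  This is a minimal
illustration, not the module's implementation.  It assumes that each neighbor
receives S multiplied by the edge weight, that energy arriving at a node is
summed, and that a depth cap guards against endless bouncing between two nodes.
The function name spread, the graph layout, and the node names are all
hypothetical.

```perl
use strict;
use warnings;

# Hypothetical graph layout: node => { neighbor => edge weight (0..1) }.
# Edges are listed in both directions so energy can flow either way.
my %graph = (
    doc1   => { graph => 1.0, search => 0.5 },
    doc2   => { graph => 0.5 },
    graph  => { doc1 => 1.0, doc2 => 0.5 },
    search => { doc1 => 0.5 },
);

sub spread {
    my ($g, $node, $energy, $threshold, $result, $depth) = @_;
    $result->{$node} += $energy;          # assumption: arriving energy is summed
    return if $depth <= 0;                # assumption: safety cap on recursion
    my $neighbors = $g->{$node} || {};
    my $n = keys %$neighbors or return;   # no neighbors, nothing to distribute
    my $s = $energy / $n;                 # S: energy divided by neighbor count
    return if $s < $threshold;            # S below threshold: stop propagating
    for my $next (keys %$neighbors) {
        # Assumption: a neighbor receives S scaled by the edge weight,
        # so it can never receive more than S.
        spread($g, $next, $s * $neighbors->{$next}, $threshold, $result, $depth - 1);
    }
}

my %result;
spread(\%graph, 'graph', 100, 10, \%result, 5);
printf "%-6s %.2f\n", $_, $result{$_} for sort keys %result;
```

In this toy graph doc1 ends up with more energy than doc2, because its edge to
the query term is heavier and it also receives energy back through a second
path.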


INPUT FORMAT

The module can take either a hash of document titles and term lists, or a
term-document matrix (TDM) file.  The first format looks like this:

	{
		TITLE => {
			WORD => COUNT,
			WORD => COUNT,
			...
		},
		...
	}
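
For concreteness, here is a small made-up collection in that shape, together
with a hypothetical helper (node_sets, not part of the module) that lists the
document nodes and term nodes such a hash implies.

```perl
use strict;
use warnings;

# Made-up collection in the TITLE => { WORD => COUNT } shape.
my %docs = (
    'First Document'  => { elephant => 2, snake => 1 },
    'Second Document' => { snake => 3, camel => 1 },
);

# Hypothetical helper: return the two node sets of the bipartite graph,
# document titles and the distinct terms found across all documents.
sub node_sets {
    my ($docs) = @_;
    my %seen;
    $seen{$_}++ for map { keys %$_ } values %$docs;
    return ([ sort keys %$docs ], [ sort keys %seen ]);
}

my ($titles, $terms) = node_sets(\%docs);
print "documents: ", join(', ', @$titles), "\n";
print "terms: ", join(', ', @$terms), "\n";
```

A term that occurs in several documents, such as "snake" here, becomes a single
term node with one edge to each of those documents.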
	
The TDM input format is useful for very large collections.  The TDM file is a
plain text file with the following format:

	Arbitrary text
	Arbitrary text
	TERMS DOCS
	Arbitrary text
	A B C B C B C
	....

The first two lines are arbitrary text.


INSTALLATION

To install this module type the following:

   perl Makefile.PL
   make
   make test
   make install

COPYRIGHT AND LICENCE

Copyright (C) 2003 Maciej Ceglowski, John Cuadrado, NITLE

This software may be distributed under the same terms as Perl itself.

