Plugins are special modules that extend the programme's functionality. CNSearch uses plugins to index files of different types.
Plugins must be stored in the indexer directory. UNIX and Linux versions have the extension .so, Windows versions have the extension .dll. To disable a plugin, simply move it to another directory.
The current distribution includes the plugins listed below. They allow indexing files of the following types:
Name of file in UNIX/Linux version | Name of file in Windows version | Type of the processed document |
---|---|---|
libtxt.so | libtxt.dll | *.TXT - text files |
librtf.so | librtf.dll | *.RTF - Rich text format files |
libdoc.so | libdoc.dll | *.DOC - Microsoft Word files |
libxls.so | libxls.dll | *.XLS - Microsoft Excel files |
libmp3.so | libmp3.dll | *.MP3 - MPEG Layer 3 audio files |
In version 0.92, plugins do not attempt to detect character sets, because this is unnecessary for most files.
For documents processed by a plugin, the 'encoding' field is replaced by the text set in the plugin, which makes it possible to compose templates that display the type of the found document.
At start-up the indexer lists all active plugins, for example:
F:\1\bin\indexer>searchctl.exe localhost
CNSearch ver.0.92 [build 2073]
Compiled 07.04.2002 under MS Windows 2000 [Version 5.00.2195]
Rebuilding URL list...Ok.
Loading library: RTF (Rich text format)
Loading library: TXT (Plain text)
Loading library: DOC (Microsoft Word document format)
http://www.test.ru/
The main benefit of plugins is that you can develop your own to index files in specific formats. For example, it is possible to create a plugin for searching in images.
The distribution includes the file 'plugin.zip' in the '/manual' directory. It contains the source code of the plugin that processes text files.
To be found by the system, a plugin must have the correct extension, be stored in the same directory as the indexer, and contain the following functions:
Name of function | Function description |
---|---|
char *get_info(void) | Returns a string: information about the plugin (its name) |
char *get_mime(void) | Returns a string: the list of MIME types processed by this plugin, separated by the vertical bar character "|" |
char* get_shortdesc(void) | Returns a string: the short name of the file type (shown where the character set description appears for HTML files) |
char* get_range(void) | Returns a string: the Range field for the HTTP header (see RFC 2068). If the "Range" field is not used, the function must return NULL. |
char* get_title(void) | Returns a string: the document title. If NULL is returned, the URL of the document is displayed |
TPluginWord* get_word(unsigned char *d, unsigned long filesize) | Main function. Returns a pointer to a 'TPluginWord' structure containing a word that should be added to the search index. Successive calls must return the words contained in the document one after another. |
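The function table above can be illustrated with a minimal plugin skeleton. This is only a sketch under stated assumptions, not the code shipped in 'plugin.zip': the returned strings are examples, the value written to 'rel' is a placeholder because its meaning is not described here, the read position is reset in get_title() on the assumption that it is called before the first get_word() of each document, and an empty word with 'end' set to 'true' is assumed to mean "no more words". Exporting the functions from a Windows .dll is left out.

```c
#include <ctype.h>
#include <stdbool.h>  /* the structure uses 'bool'; plain C needs this header */
#include <stddef.h>

typedef struct {
    char word[32];
    int  rel;
    bool end;
} TPluginWord;

/* Read position inside the current document (sketch-only state). */
static unsigned long pos = 0;

char *get_info(void)      { static char s[] = "TXT (Plain text)"; return s; }

/* Several MIME types would be joined with "|", e.g. "text/plain|text/csv". */
char *get_mime(void)      { static char s[] = "text/plain"; return s; }

char *get_shortdesc(void) { static char s[] = "TXT"; return s; }

/* NULL: this plugin does not use the HTTP "Range" field. */
char *get_range(void)     { return NULL; }

/* NULL: let the programme display the URL instead of a title.
   Assumption: called once per document, so the position is reset here. */
char *get_title(void)     { pos = 0; return NULL; }

/* Returns the next run of ASCII letters and digits, lower-cased.
   Words longer than 31 characters are truncated to fit word[32]. */
TPluginWord *get_word(unsigned char *d, unsigned long filesize)
{
    static TPluginWord w;
    size_t n = 0;

    /* Skip separators before the next word. */
    while (pos < filesize && !isalnum(d[pos]))
        pos++;

    /* Copy the word, consuming any characters that do not fit. */
    while (pos < filesize && isalnum(d[pos])) {
        if (n < sizeof w.word - 1)
            w.word[n++] = (char)tolower(d[pos]);
        pos++;
    }
    w.word[n] = '\0';

    w.rel = 0;          /* placeholder value, meaning not documented here */
    w.end = (n == 0);   /* assumption: empty word marks the end of the document */
    return &w;
}
```

Each call returns a pointer to the same static structure, so the caller has to use the word before calling get_word() again.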
The TPluginWord structure looks as follows:
typedef struct {
    char word[32];
    int rel;
    bool end;
} TPluginWord;
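where 'word' holds the word to be added to the search index and 'end' is set to 'true' to signal that get_word() should not be called again for the document.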
The programme calls the plugin functions as follows.
The functions get_info(), get_mime(), and get_shortdesc() are called once, when a plugin is loaded. The function get_title() is called once for each document; after that, get_word() is called repeatedly for the same document until the 'end' field of the TPluginWord structure becomes 'true'.
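The call order described above can be sketched from the calling side. This is not the indexer's actual code: the 'Plugin' table, announce_plugin(), index_document() and the stub 'demo_plugin' are hypothetical names invented for illustration, and get_range() is left out because the text does not say when it is called. The printed "Loading library" line is only assumed to correspond to the start-up listing.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef struct { char word[32]; int rel; bool end; } TPluginWord;

/* Hypothetical table of entry points of one loaded plugin. */
typedef struct {
    char *(*get_info)(void);
    char *(*get_mime)(void);
    char *(*get_shortdesc)(void);
    char *(*get_title)(void);
    TPluginWord *(*get_word)(unsigned char *d, unsigned long filesize);
} Plugin;

/* Once per plugin, at load time: get_info(), get_mime(), get_shortdesc(). */
void announce_plugin(const Plugin *p)
{
    printf("Loading library: %s\n", p->get_info());
    printf("  MIME types: %s, short name: %s\n",
           p->get_mime(), p->get_shortdesc());
}

/* Once per document: get_title(), then get_word() until 'end' is true.
   Returns the number of words received. */
unsigned long index_document(const Plugin *p, const char *url,
                             unsigned char *data, unsigned long size)
{
    const char *title = p->get_title();
    unsigned long count = 0;

    if (title == NULL)
        title = url;  /* NULL title: the URL is displayed instead */

    for (;;) {
        TPluginWord *w = p->get_word(data, size);
        if (w == NULL || w->end)
            break;
        count++;      /* a real indexer would add w->word to the index here */
    }
    printf("%s: %lu words\n", title, count);
    return count;
}

/* Stub plugin that ignores its input and yields three fixed words. */
static int demo_next = 0;
static char *demo_info(void)      { static char s[] = "DEMO (stub plugin)"; return s; }
static char *demo_mime(void)      { static char s[] = "text/plain"; return s; }
static char *demo_shortdesc(void) { static char s[] = "DEMO"; return s; }
static char *demo_title(void)     { demo_next = 0; return NULL; }
static TPluginWord *demo_word(unsigned char *d, unsigned long filesize)
{
    static const char *words[] = { "alpha", "beta", "gamma" };
    static TPluginWord w;
    (void)d; (void)filesize;
    w.rel = 0;
    if (demo_next < 3) {
        strcpy(w.word, words[demo_next++]);
        w.end = false;
    } else {
        w.word[0] = '\0';
        w.end = true;
    }
    return &w;
}

const Plugin demo_plugin = {
    demo_info, demo_mime, demo_shortdesc, demo_title, demo_word
};
```

A caller would run announce_plugin(&demo_plugin) once and then index_document(&demo_plugin, url, data, size) for every document whose MIME type the plugin handles.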
That is all about plugins. If you have any suggestions, feel free to contact us.