fpc/packages/fpindexer
examples * HTML search database example 2018-07-08 19:46:15 +00:00
src * Support for available words search 2018-07-10 07:27:44 +00:00
fpmake.pp Merge commit 48814: 2021-03-23 23:59:37 +00:00
Makefile Regenerate all Makefile's after ios introduction and macos->macosclassic changes inside utils/fpcm/fpcmake.ini 2020-09-23 09:47:20 +00:00
Makefile.fpc * fixes to 3.2.1 2020-06-20 16:47:24 +00:00
README.txt

This directory contains an implementation of an indexing and search
mechanism.

Architecture:
=============

The indexer and search mechanism design is modular:

  - A storage mechanism
  - An indexer class
  - A search class
  - Text processing classes.

The indexer uses a text processing class and a storage mechanism to create a
search database. The search class uses the same storage mechanism to search
the database. 
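
The separation described above can be sketched as a small self-contained
program. This is only an illustration of the idea: TSketchStore,
TSketchMemStore, IndexText and SketchLookup are hypothetical names invented
for this sketch and are not the fpindexer API. It assumes a store that keeps
one URL per word and an indexer that splits text on spaces.

```pascal
program ArchSketch;
{$mode objfpc}{$H+}
uses SysUtils, Classes;

type
  // Hypothetical storage abstraction (NOT the fpindexer API).
  TSketchStore = class
  public
    procedure AddWord(const AWord, AURL: string); virtual; abstract;
    function Find(const AWord: string): string; virtual; abstract;
  end;

  // In-memory store: one 'word=url' pair per word, last writer wins.
  TSketchMemStore = class(TSketchStore)
  private
    FItems: TStringList;
  public
    constructor Create;
    destructor Destroy; override;
    procedure AddWord(const AWord, AURL: string); override;
    function Find(const AWord: string): string; override;
  end;

constructor TSketchMemStore.Create;
begin
  inherited Create;
  FItems := TStringList.Create;
end;

destructor TSketchMemStore.Destroy;
begin
  FItems.Free;
  inherited Destroy;
end;

procedure TSketchMemStore.AddWord(const AWord, AURL: string);
begin
  FItems.Values[LowerCase(AWord)] := AURL;
end;

function TSketchMemStore.Find(const AWord: string): string;
begin
  Result := FItems.Values[LowerCase(AWord)];
end;

// "Indexer": splits the text on spaces and records each word in the store.
procedure IndexText(AStore: TSketchStore; const AText, AURL: string);
var
  Words: TStringList;
  I: Integer;
begin
  Words := TStringList.Create;
  try
    Words.Delimiter := ' ';
    Words.StrictDelimiter := True;
    Words.DelimitedText := AText;
    for I := 0 to Words.Count - 1 do
      if Words[I] <> '' then
        AStore.AddWord(Words[I], AURL);
  finally
    Words.Free;
  end;
end;

// "Search": indexes AText, then queries the same store for AWord.
function SketchLookup(const AText, AURL, AWord: string): string;
var
  Store: TSketchStore;
begin
  Store := TSketchMemStore.Create;
  try
    IndexText(Store, AText, AURL);
    Result := Store.Find(AWord);
  finally
    Store.Free;
  end;
end;

begin
  WriteLn(SketchLookup('hello indexed world', 'doc1.txt', 'World'));
end.
```

Because the indexer and the search step only see the abstract store class,
a different store (for example one backed by a database) could be swapped in
without changing either of them.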

Currently, 3 storage mechanisms are supported:
  - In-memory database (plus flat file storage)
  - Firebird database
  - SQLite database.

3 input text processors are supported:
  - Plain text
  - HTML
  - Pascal source files.
When a file is indexed, the text processor is selected based on the
extension of the file.
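
Selection by extension can be sketched as follows. The function
ReaderKindForFile, the type TSketchReaderKind and the extension table are
assumptions made for this illustration; the extensions actually registered
by the ireader* units may differ.

```pascal
program ReaderSelectSketch;
{$mode objfpc}{$H+}
uses SysUtils;

type
  // Hypothetical reader kinds, standing in for the three text processors.
  TSketchReaderKind = (rkNone, rkText, rkHTML, rkPascal);

// Picks a reader kind from the file extension (case-insensitive).
// The extension table below is illustrative only.
function ReaderKindForFile(const AFileName: string): TSketchReaderKind;
var
  Ext: string;
begin
  Ext := LowerCase(ExtractFileExt(AFileName)); // includes the leading dot
  if Ext = '.txt' then
    Result := rkText
  else if (Ext = '.html') or (Ext = '.htm') then
    Result := rkHTML
  else if (Ext = '.pas') or (Ext = '.pp') then
    Result := rkPascal
  else
    Result := rkNone; // no processor known for this extension
end;

begin
  WriteLn(ReaderKindForFile('index.HTML'));
  WriteLn(ReaderKindForFile('unit1.pas'));
  WriteLn(ReaderKindForFile('archive.zip'));
end.
```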

It is possible to specify a list of words to ignore per language, and a
mask for words to ignore.
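
The effect of an ignore list combined with a mask can be sketched like this.
SketchMatches and SketchIsIgnored are hypothetical helpers written for this
illustration; they do not use the masks unit shipped with the package, and
the sketch handles a single list (one language) only. It assumes a mask
syntax where '*' matches any run of characters and '?' matches exactly one.

```pascal
program IgnoreSketch;
{$mode objfpc}{$H+}
uses SysUtils, Classes;

// Minimal wildcard match: '*' matches any run of characters, '?' exactly one.
function SketchMatches(const S, Mask: string;
  SI: Integer = 1; MI: Integer = 1): Boolean;
begin
  if MI > Length(Mask) then
    Exit(SI > Length(S));
  if Mask[MI] = '*' then
    // '*' either matches nothing, or consumes one character of S.
    Exit(SketchMatches(S, Mask, SI, MI + 1) or
         ((SI <= Length(S)) and SketchMatches(S, Mask, SI + 1, MI)));
  Result := (SI <= Length(S)) and
            ((Mask[MI] = '?') or (Mask[MI] = S[SI])) and
            SketchMatches(S, Mask, SI + 1, MI + 1);
end;

// A word is skipped if it is in the ignore list or matches the mask.
// AIgnoreCSV is a comma-separated list of words; an empty mask is unused.
function SketchIsIgnored(const AWord, AMask, AIgnoreCSV: string): Boolean;
var
  Ignore: TStringList;
begin
  Ignore := TStringList.Create;
  try
    Ignore.CommaText := LowerCase(AIgnoreCSV);
    Result := (Ignore.IndexOf(LowerCase(AWord)) >= 0) or
              ((AMask <> '') and
               SketchMatches(LowerCase(AWord), LowerCase(AMask)));
  finally
    Ignore.Free;
  end;
end;

begin
  WriteLn(SketchIsIgnored('the', '', 'the,and'));
  WriteLn(SketchIsIgnored('x123', 'x*', 'the,and'));
  WriteLn(SketchIsIgnored('index', 'x*', 'the,and'));
end.
```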

On top of the file/stream indexer, a database indexer is implemented.
It can be used to implement full-text search on a database.

Sample programs for all 3 classes (search, index and index DB) are provided
in the examples directory.

Overview of units:
==================
fpindexer:
  The indexer, search and abstract database engine classes.
  An abstract SQL storage engine class.

ireaderhtml:
  An input engine for HTML files.

ireaderpas:
  An input engine for Pascal files.

ireadertxt:
  An input engine for plain text files.

masks:
  Copied from the LCL, to implement masks on words.

memindexdb:
  A memory storage engine.

sqldbindexdb:
  An abstract SQLDB storage engine.

fbindexdb:
  A descendant of the SQLDB storage engine which uses a Firebird database.

sqliteindexdb:
  SQLite database storage engine.

dbindexdb:
  Component to index a database.

Overview of classes:
====================

fpindexer:
----------
TFPIndexer: The indexing engine.

TCustomFileReader: Abstract input engine.
TFileHandlersManager: Factory for file reader classes.
TIgnoreListDef: Word ignore list definition.
TIgnoreLists: Collection of ignore lists.

TFPSearch: The search engine.

TCustomIndexDB: Abstract storage engine.
TSQLIndexDB: Abstract SQL-based storage engine.

ireaderhtml:
------------
  TIReaderHTML: HTML input engine.

ireaderpas:
-----------
  TIReaderPAS: Pascal input engine.

ireadertxt:
-----------
  TIReaderTXT: Plain text input engine.

memindexdb:
-----------
  TMemIndexDB: In-memory storage engine.
  TFileIndexDB: Descendant of TMemIndexDB which stores everything in a flat
                file using a custom format.

sqldbindexdb:
-------------
  TSQLDBIndexDB: Abstract class for SQLDB-based storage (descendant of TSQLIndexDB).

sqliteindexdb:
--------------
  TSQLiteIndexDB: SQLite-based storage engine, descendant of TSQLIndexDB.

fbindexdb:
----------
  TFBIndexDB: TSQLDBIndexDB descendant for Firebird.

dbindexdb:
----------
  TDBIndexer: Implements a database indexer, using a second database as the index.
  TIBIndexer: Descendant of TDBIndexer for Firebird.