An Information Retrieval using Weighted Index Terms in Natural Language Document Collections

Abstract:

Indexing a document means describing its content so that it can be retrieved more easily from a document store. This paper describes the implementation of automatic indexing with various term weighting schemes in an information retrieval (IR) system, using the CISI document collection, which consists of abstracts of information retrieval papers, and the NPL collection, which consists of abstracts of electronic engineering documents.

The system starts with a simple form of text representation: it extracts keywords and represents each document as a vector of weights that reflect the importance of those keywords in the documents of the collection. It then evaluates and compares the retrieval effectiveness of various search models based on automatic text-word indexing, and presents the results of experiments conducted to study the improvement in text retrieval effectiveness obtained by successively applying these approaches.
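
As an illustration of representing documents as vectors of keyword weights, the following minimal sketch uses one common weighting scheme, tf-idf (raw term frequency multiplied by the logarithm of inverse document frequency), with cosine similarity for comparison. This scheme is an assumption chosen for illustration; the abstract does not state which weighting schemes the system implements. The function names `tokenize`, `tfidf_vectors`, and `cosine` are hypothetical, and the whitespace tokenizer stands in for whatever keyword extraction the system actually uses.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed keyword extraction: lowercase, split on whitespace,
    # keep purely alphabetic tokens. No stop-word removal or stemming.
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_vectors(docs):
    # Build one sparse weight vector (term -> weight) per document.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Weight = raw term frequency * log(N / df).
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    # Cosine similarity between two sparse weight vectors.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Toy documents, invented for this example (not taken from CISI or NPL).
    docs = [
        "information retrieval systems",
        "retrieval of electronic documents",
        "electronic engineering abstracts",
    ]
    vecs = tfidf_vectors(docs)
    print(cosine(vecs[0], vecs[1]))
    print(cosine(vecs[0], vecs[2]))
```

In this sketch a term that occurs in every document receives weight zero, which is the usual effect of the idf factor: such a term does not help distinguish one document from another.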